Data Import Documentation¶
Overview¶
The Data Import module of the DataAnalysisToolkit provides functionalities to import data from various sources such as Excel files, SQL databases, and APIs. It is designed to simplify the process of data collection and integration for analysis and machine learning tasks.
Features¶
Excel Connector: Import data from Excel files (.xlsx, .xls).
SQL Connector: Connect and import data from SQL databases like MySQL, PostgreSQL, etc.
API Connector: Fetch data from various APIs with handling for authentication and rate-limiting.
Data Integrator: Merge or concatenate data from different sources into a unified DataFrame.
Data Formatter: Standardize and transform the imported data into a consistent format.
Getting Started¶
Excel Connector¶
To import data from Excel files:
from data_sources.excel_connector import ExcelConnector
connector = ExcelConnector('path/to/excel/file.xlsx')
data = connector.load_data(sheet_name='Sheet1')
SQL Connector¶
For SQL databases:
from data_sources.sql_connector import SQLConnector
connector = SQLConnector('database_URI')
data = connector.query_data('SELECT * FROM table_name')
API Connector¶
To fetch data from an API:
from data_sources.api_connector import APIConnector
connector = APIConnector('https://api.example.com', auth=('username', 'password'))
response = connector.get('endpoint')
Data Integrator¶
Merge or concatenate data from multiple sources:
from integrators.data_integrator import DataIntegrator
integrator = DataIntegrator()
integrator.add_data(data_from_excel)
integrator.add_data(data_from_sql)
combined_data = integrator.concatenate_data()
Data Formatter¶
Standardize or transform the data:
from formatters.data_formatter import DataFormatter
formatter = DataFormatter(combined_data)
formatter.standardize_dates('date_column')
formatter.normalize_numeric(['numeric_column'])
Error Handling¶
The toolkit includes error handling for common issues encountered during data import, such as file not found, invalid format, or connection issues. Ensure to handle exceptions in your implementation to maintain robustness.
Examples¶
Refer to the examples
directory for detailed examples of using each connector and integrating data from multiple sources.
Contribution¶
Contributions to enhance the data import module, such as adding new connectors or improving existing functionalities, are welcome. Please refer to the contribution guidelines for more details.