MySQL is a relational database management system (RDBMS) based on SQL (Structured Query Language). It is used for a wide range of purposes, including data warehousing, e-commerce, and logging applications.
In a YAML file, the config section contains the following properties:
Connectorname: MySQL
host: Hostname or IP address of the server
port: Server running port
username: Username
password: Password
database: Database
drivername: mysql+pymysql
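
The value `mysql+pymysql` follows the SQLAlchemy "dialect+driver" naming scheme, so the properties above map naturally onto a SQLAlchemy-style database URL. The following is a minimal sketch of that mapping, assuming the connector builds such a URL internally (the source does not confirm this). The function name `build_connection_url` and the sample values are hypothetical.

```python
from urllib.parse import quote_plus


def build_connection_url(config):
    """Assemble a SQLAlchemy-style URL from the config properties.

    Assumption: the connector combines the properties this way; this is
    an illustration, not the connector's actual code.
    """
    # Percent-encode credentials so characters such as '@' or '/' do not
    # break the URL structure.
    user = quote_plus(config["username"])
    password = quote_plus(config["password"])
    return (
        f"{config['drivername']}://{user}:{password}"
        f"@{config['host']}:{config['port']}/{config['database']}"
    )


# Hypothetical sample values, for illustration only.
example_config = {
    "host": "db.example.com",
    "port": 3306,
    "username": "report_user",
    "password": "p@ss/word",
    "database": "sales",
    "drivername": "mysql+pymysql",
}

print(build_connection_url(example_config))
```

Encoding the credentials matters in practice: a password containing `@` or `/` would otherwise be misread as part of the host or path.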
Click the Data Hub icon on the Navigation Pane.
Click Add Project and provide the new project’s name.
Select the MySQL template.

Parameters | Description |
---|---|
Host: | Specify the hostname of the MySQL server. |
Port: | Specify the port number of the MySQL server (default is 3306). |
Username: | Provide the username to authenticate with the MySQL server. |
Password: | Provide the password to authenticate with the MySQL server. |
Database: | Specify the name of the MySQL database from which data will be extracted. |
Driver Name: | Specify the driver name for connecting to MySQL (e.g., mysql+pymysql). |
Select: Tablename(s) | Specify the list of table names to load from the MySQL server. |
Metadata (optional): Replication Method | Specify the replication method for the table(s). Options are FULL_TABLE or INCREMENTAL. |
Metadata (optional): Replication Key | Specify the replication key for incremental replication. This key identifies new or updated records. |
Metadata (optional): Replication Value | Specify the replication value from which incremental replication starts. |

Open Schedules and select the created mysql project.
Click Run Now.
Use the Schedule option to schedule the refresh hourly.
Use the Edit DataSource option to view the created table(s), such as the ‘votes’ table.

In the metadata section, define the mode of data refresh. There are two modes: INCREMENTAL and FULL_TABLE. Only columns of the DateTime datatype are supported as the replication key.
INCREMENTAL mode fetches data using the date column named in the replication key, starting from the date given in the replication value. Once the project is scheduled, the replication value is updated automatically from the imported data.
metadata:
  TableName:
    replication_method: INCREMENTAL
    replication_key: Column name
    replication_value: column value that data starts from
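
The INCREMENTAL behaviour described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the connector's implementation: rows are modelled as in-memory dictionaries, the comparison is assumed to be inclusive (`>=`), and the new replication value is assumed to be the largest key value among the imported rows. The function name `incremental_fetch` and the sample rows are hypothetical.

```python
from datetime import datetime


def incremental_fetch(rows, replication_key, replication_value):
    """Return rows at or after the replication value, plus the new bookmark.

    Assumptions: inclusive comparison, and the bookmark advances to the
    maximum key value seen in the imported rows.
    """
    selected = [r for r in rows if r[replication_key] >= replication_value]
    if selected:
        # Advance the bookmark based on the imported data.
        replication_value = max(r[replication_key] for r in selected)
    return selected, replication_value


# Hypothetical sample rows for a 'votes' table.
votes = [
    {"id": 1, "last_modified_on": datetime(2023, 7, 18, 9, 0)},
    {"id": 2, "last_modified_on": datetime(2023, 7, 19, 8, 30)},
    {"id": 3, "last_modified_on": datetime(2023, 7, 20, 14, 0)},
]

fetched, bookmark = incremental_fetch(
    votes, "last_modified_on", datetime(2023, 7, 19)
)
print([row["id"] for row in fetched], bookmark)
```

If no rows qualify, the bookmark is left unchanged, so the next run retries from the same starting point.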
FULL_TABLE mode fetches data using the date column named in the replication key, starting from the date given in the replication value. Once the project is scheduled, the replication value is advanced on each run according to interval_type and interval_value. For example, with interval_type set to ‘year’, interval_value set to ‘1’, and a replication value of Jan 1, 2000, the first scheduled run fetches records from Jan 1, 2000 to Dec 31, 2000. The next run fetches records from Jan 1, 2001 to Dec 31, 2001, and so on.
metadata:
  TableName:
    replication_method: FULL_TABLE
    replication_key: Column name
    replication_value: column value that data starts from
    interval_type: days/hours/minutes/year/month
    interval_value: integer value to add in interval type
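
The interval arithmetic for FULL_TABLE mode can be sketched as follows. This is a hedged illustration rather than the connector's actual logic: it assumes each run covers a half-open window `[start, end)` and that the next run starts where the previous one ended. The helper names `add_interval` and `schedule_windows` are hypothetical, and month and year steps clamp the day to the end of the target month.

```python
import calendar
from datetime import datetime, timedelta


def add_interval(start, interval_type, interval_value):
    """Add interval_value units of interval_type to a datetime."""
    if interval_type in ("days", "hours", "minutes"):
        return start + timedelta(**{interval_type: interval_value})
    if interval_type == "year":
        months = 12 * interval_value
    elif interval_type == "month":
        months = interval_value
    else:
        raise ValueError(f"unsupported interval_type: {interval_type!r}")
    # Calendar-aware month arithmetic; clamp the day (e.g. Jan 31 -> Feb 29).
    total = start.year * 12 + (start.month - 1) + months
    year, month_index = divmod(total, 12)
    last_day = calendar.monthrange(year, month_index + 1)[1]
    return start.replace(
        year=year, month=month_index + 1, day=min(start.day, last_day)
    )


def schedule_windows(replication_value, interval_type, interval_value, runs):
    """List the (start, end) window for each of the first `runs` runs."""
    windows = []
    start = replication_value
    for _ in range(runs):
        end = add_interval(start, interval_type, interval_value)
        windows.append((start, end))
        start = end  # assumed: the next run begins where this one ended
    return windows


for start, end in schedule_windows(datetime(2000, 1, 1), "year", 1, 2):
    print(start.date(), "up to (not including)", end.date())
```

With a one-year interval starting on Jan 1, 2000, the first window ends just before Jan 1, 2001, which matches the Jan 1 to Dec 31, 2000 range described above.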

Sample configuration with INCREMENTAL replication:

version: 1
encrypt_credentials: false
plugins:
  extractors:
    - name: tap_postgres
      connectorname: MySQL
      config:
        host: Hostname or IP address of the server
        port: Server running port
        username: Username
        password: Password
        database: Database
        drivername: mysql+pymysql
      select:
        - TABLE1
        - TABLE2
      metadata:
        TABLE1:
          replication_method: INCREMENTAL
          replication_key: last_modified_on
          replication_value: 2023-07-19 00:00:00
        TABLE2:
          replication_method: INCREMENTAL
          replication_key: last_modified_on
          replication_value: 2023-07-19 00:00:00

Sample configuration with FULL_TABLE replication:

version: 1
encrypt_credentials: false
plugins:
  extractors:
    - name: tap_postgres
      connectorname: MySQL
      config:
        host: Hostname or IP address of the server
        port: Server running port
        username: Username
        password: Password
        database: Database
        drivername: mysql+pymysql
      select:
        - TABLE1
        - TABLE2
      metadata:
        TABLE1:
          replication_method: FULL_TABLE
          replication_key: last_modified_on
          replication_value: 2023-07-19 00:00:00
          interval_type: days
          interval_value: 6
        TABLE2:
          replication_method: FULL_TABLE
          replication_key: last_modified_on
          replication_value: 2023-07-19 00:00:00
          interval_type: days
          interval_value: 6
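
Before scheduling a project, it can help to sanity-check the select and metadata sections against each other. The sketch below works on an already-parsed extractor entry (a plain dictionary) and reports likely mistakes. The rules it enforces are assumptions drawn from the descriptions above, not a documented validation contract, and `validate_extractor` is a hypothetical helper.

```python
ALLOWED_INTERVAL_TYPES = {"days", "hours", "minutes", "year", "month"}

# Assumed required keys per replication method, based on the samples above.
REQUIRED_KEYS = {
    "INCREMENTAL": {"replication_key", "replication_value"},
    "FULL_TABLE": {
        "replication_key",
        "replication_value",
        "interval_type",
        "interval_value",
    },
}


def validate_extractor(extractor):
    """Return a list of human-readable problems found in one extractor."""
    problems = []
    selected = set(extractor.get("select", []))
    for table, meta in extractor.get("metadata", {}).items():
        if table not in selected:
            problems.append(f"{table}: not listed under select")
        method = meta.get("replication_method")
        if method not in REQUIRED_KEYS:
            problems.append(f"{table}: unknown replication_method {method!r}")
            continue
        for key in sorted(REQUIRED_KEYS[method] - set(meta)):
            problems.append(f"{table}: missing {key}")
        interval_type = meta.get("interval_type")
        if interval_type is not None and interval_type not in ALLOWED_INTERVAL_TYPES:
            problems.append(f"{table}: unsupported interval_type {interval_type!r}")
    return problems


# Hypothetical parsed entry; TABLE2 deliberately omits interval_value.
extractor = {
    "select": ["TABLE1", "TABLE2"],
    "metadata": {
        "TABLE1": {
            "replication_method": "INCREMENTAL",
            "replication_key": "last_modified_on",
            "replication_value": "2023-07-19 00:00:00",
        },
        "TABLE2": {
            "replication_method": "FULL_TABLE",
            "replication_key": "last_modified_on",
            "replication_value": "2023-07-19 00:00:00",
            "interval_type": "days",
        },
    },
}

print(validate_extractor(extractor))
```

Returning a list of messages rather than raising on the first error lets all problems in a configuration be reported in one pass.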