With Databand, you can track the execution of your DataStage jobs. A syncer scans your DataStage project every few seconds and reports the metadata collected from any jobs that have run. With that metadata, you can enable alerting that notifies your data team about the health of your jobs and the quality of their inputs and outputs.
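The scan-and-report cycle described above can be pictured as a simple polling loop. The following is only an illustrative sketch, not Databand's implementation: `fetch_runs`, `report`, the `run_id` key, and the 5-second interval are all hypothetical names and values chosen for the example.

```python
import time
from typing import Callable, Iterable, List, Set


def collect_new_runs(seen_ids: Set[str], runs: Iterable[dict]) -> List[dict]:
    """Return the runs that have not been reported yet and remember their IDs."""
    fresh = []
    for run in runs:
        if run["run_id"] not in seen_ids:
            seen_ids.add(run["run_id"])
            fresh.append(run)
    return fresh


def sync(
    fetch_runs: Callable[[], Iterable[dict]],  # hypothetical: lists runs in the project
    report: Callable[[dict], None],            # hypothetical: sends run metadata onward
    interval_seconds: float = 5.0,             # assumed value for "every few seconds"
    max_scans: int = 3,                        # bounded here so the sketch terminates
) -> None:
    """Scan repeatedly, reporting each run exactly once."""
    seen_ids: Set[str] = set()
    for scan in range(max_scans):
        for run in collect_new_runs(seen_ids, fetch_runs()):
            report(run)
        if scan < max_scans - 1:
            time.sleep(interval_seconds)
```

Tracking the IDs that were already seen is what keeps a run from being reported again on the next scan.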
Databand natively offers the following alerts for your DataStage job runs:
- Run and task state alerts (e.g. running, successful, failed)
- Run duration alerts
  - anomaly detection
  - percent ranges (e.g. duration within 20% of 100 seconds)
  - basic comparison operators (e.g. duration > 100 seconds)
- Schema changes for inputs and outputs
  - new columns added
  - old columns removed
  - datatypes of existing columns changed
- Record counts for inputs and outputs
  - anomaly detection
  - percent ranges (e.g. record count within 20% of 100,000 rows)
  - basic comparison operators (e.g. record count > 100,000 rows)
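To make two of the alert conditions above concrete, here is a minimal sketch of a percent-range check and a schema comparison. It is not Databand's code. The function names are hypothetical, schemas are assumed to be plain column-to-datatype mappings, and the range is assumed to be inclusive at both ends (so "within 20% of 100 seconds" is read as 80 to 120 seconds).

```python
from typing import Dict, List


def within_percent_range(value: float, baseline: float, percent: float) -> bool:
    """True if value lies within +/- percent of baseline (inclusive, by assumption)."""
    tolerance = abs(baseline) * percent / 100.0
    return abs(value - baseline) <= tolerance


def schema_changes(old: Dict[str, str], new: Dict[str, str]) -> Dict[str, List[str]]:
    """Compare two {column: datatype} mappings and list the three kinds of change."""
    shared = set(old) & set(new)
    return {
        "added": sorted(set(new) - set(old)),
        "removed": sorted(set(old) - set(new)),
        "type_changed": sorted(col for col in shared if old[col] != new[col]),
    }
```

For example, `within_percent_range(115, 100, 20)` holds while `within_percent_range(125, 100, 20)` does not, and comparing `{"id": "int", "name": "str"}` with `{"id": "bigint", "email": "str"}` lists `email` as added, `name` as removed, and `id` as changed in type.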
To begin monitoring your DataStage project in Databand, start by creating a DataStage syncer in the Databand UI:
- Click on Integrations in the left-hand menu.
- Click the Connect button under DataStage.
- In the syncer configuration, provide the following details:
  - Source name - The name of your DataStage syncer in the Databand UI. It lets you filter flows based on their DataStage projects.
  - Project ID - The ID of the DataStage project you would like to monitor. The project ID can be found in the URL of your DataStage project.
  - API key - The API key allows Databand to authenticate with your DataStage project. To generate a new API key for your user identity, follow the steps outlined in the IBM documentation.
  - Hostname - The hostname for an on-prem deployment of DataStage.
  - IAM Service URL - The URL of the IAM authentication service for an on-prem deployment of DataStage.
  - Number of threads - The number of concurrent threads to use in the DataStage API client. The recommended default is 2.
- After providing the required parameters, click Save.
Once these steps have been completed, the next time a job runs in your DataStage project, you will see it in your Databand UI. The name of your job in the DataStage UI will become the pipeline name in the Databand UI.
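As a way to keep the syncer settings straight before entering them in the UI, the form fields can be modeled as a small record. This is a hypothetical sketch, not a Databand API: the class and field names are invented for illustration, and treating Source name, Project ID, and API key as the required fields (with Hostname and IAM Service URL needed only for on-prem deployments) is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DataStageSyncerConfig:
    """Hypothetical mirror of the syncer form; not part of any Databand SDK."""

    source_name: str
    project_id: str
    api_key: str
    hostname: Optional[str] = None          # on-prem deployments only (assumed optional)
    iam_service_url: Optional[str] = None   # on-prem deployments only (assumed optional)
    number_of_threads: int = 2              # recommended default from the text above

    def validate(self) -> List[str]:
        """Return a list of problems; an empty list means the values look usable."""
        problems = []
        for field_name in ("source_name", "project_id", "api_key"):
            if not getattr(self, field_name).strip():
                problems.append(f"{field_name} is required")
        if self.number_of_threads < 1:
            problems.append("number_of_threads must be at least 1")
        return problems
```

Calling `DataStageSyncerConfig("sales", "1234", "secret").validate()` returns an empty list, and the thread count falls back to 2 when it is not given.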
To edit an existing DataStage syncer:
- Click on Settings in the left-hand menu.
- Click on Datasource Syncers in the settings menu.
- Click the button in the Actions column for the syncer, and select Edit from the context menu.
- Make the necessary changes in the syncer configuration, and then click the Save button.
Databand collects high-level information about the execution of your DataStage jobs, as well as general information about the inputs and outputs of your stages.
The integration currently has the following limitations:
- Each syncer supports only a single DataStage project. Support for adding multiple projects to a single syncer is planned.
- Subflows and custom steps are not yet supported.