IBM Db2 additional information
Delphix Virtualization glossary
Rollforward to Point-in-Time (RFPIT)
Delphix virtualization does not support VDB point-in-time provisioning for Db2. To fill this gap, the Db2 connector provides a feature called dSource Rollforward to Point-in-Time (also known as dSource Timestamp). The customer rolls the dSource forward to a particular point in time and then provisions a VDB from the resulting dSource snapshot, which effectively delivers point-in-time VDB provisioning.
Support for BLU or column-organized tables with DB2 connector
DB2 BLU Acceleration is a feature of IBM's DB2 relational database management system that is designed to deliver high performance for analytical workloads.
It is a high-performance, scalable, and easy-to-use platform for running analytical workloads on large datasets to derive insights from the data quickly and efficiently.
Based on a single registry setting, DB2_WORKLOAD=ANALYTICS, DB2 automatically adapts resources, configuration, workload management, and storage parameters to optimize resource consumption for the platform you are running on.
BLU Acceleration reuses the same SQL compiler and optimizer. Most utilities—including LOAD, INGEST, EXPORT, BACKUP and RESTORE, ROLLFORWARD, and many others—work as usual. In addition there is the ability to mix row-organized and column-organized tables in the same storage (tablespace), bufferpool, schema, and even within the same SQL statement.
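As a hedged illustration of mixing table organizations (the table names and columns are hypothetical), row-organized and column-organized tables can be created side by side once the ANALYTICS workload setting is in place:
db2 "CREATE TABLE sales_col (id INT, amount DECIMAL(10,2)) ORGANIZE BY COLUMN"
db2 "CREATE TABLE sales_row (id INT, amount DECIMAL(10,2)) ORGANIZE BY ROW"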
Prerequisites in staging and target environments:
- Users must set the DB2 registry variable DB2_WORKLOAD=ANALYTICS before running dSource/VDB dataset workflows (see the example after this list).
- Users must make sure memory settings are set according to the target server's instance memory.
- The restore operation restores the database configuration file from the backup image.
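As a minimal illustration (run as the Db2 instance owner on the staging or target host; the instance must be restarted for the setting to take effect):
db2set DB2_WORKLOAD=ANALYTICS
db2stop
db2start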
For more information, read IBM documentation.
IBM Db2 glossary
Db2 tablespace Container
In the Db2 connector context, Db2 uses containers to physically store data. A container can be a file, a directory, or a raw device. Because containers are objects that actually exist on a disk or a raw device, they are considered physical database objects. In contrast to containers, tablespaces are logical database objects.
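For illustration (the database name and tablespace ID are hypothetical), the containers backing a tablespace can be inspected from the Db2 CLI:
db2 connect to MYDB
db2 list tablespace containers for 2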
IBM Db2 authentication
Db2 User and group authentication is managed in a facility external to Db2 LUW, such as the operating system, a domain controller, or a Kerberos security system. This is different from other database management systems (DBMSs), such as Oracle and SQL Server, where user accounts may be defined and authenticated in the database itself, as well as in an external facility such as the operating system.
Any time a user ID and password are explicitly provided to Db2 LUW as part of an instance attachment or database connection request, Db2 attempts to authenticate that user ID and password using this external security facility. If no user ID or password is provided with the request, Db2 implicitly uses the user ID and password that were used to log in to the workstation where the request originated. More information on Db2 authentication and authorization is available via IBM documentation.
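For example (the database and user names are hypothetical), an explicit connection request supplies credentials that Db2 passes to the external security facility, while an implicit request reuses the credentials of the current OS session:
db2 connect to MYDB user db2user using <password>
db2 connect to MYDB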
Delphix Db2 authentication
Delphix for Db2 requires that the staging and target hosts already have the necessary users and authentication systems created and installed on them. Delphix will neither create users nor change database passwords as part of the provisioning process.
Db2 database level support
Delphix supports a number of Db2 database-level features. An overview of Db2 database level support is as follows:
- Support for linking multiple databases in a single instance, which allows Delphix Engine users to utilize an available instance on the target more efficiently.
- Support for using an end user supplied directory for the Delphix Connector, Db2 mount points, and Db2 Delphix files and logs.
- Support for Kerberos environments.
- Support for VDB provisioning of the same OS version level or one version higher than the source and staging instances.
- Intelligent handling of HADR logs.
High Availability Disaster Recovery (HADR)
The HADR feature of IBM Db2 provides a high availability solution for both partial and complete site failures. It protects against data loss by replicating data changes from a source database, called the primary, to one or more target databases, called the standby.
Delphix HADR support
HADR replication takes place at a database level, not at the instance level. Therefore, a standby instance can have multiple databases from multiple different primary servers/instances on it. If the instance ID on the Delphix standby is not the same as the instance ID on the primary, the Delphix standby instance ID must have database permissions secadm and dbadm granted to it on the primary database. These permissions and all HADR settings must be implemented on the primary database before you take the backup on the primary database.
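As a sketch (the standby instance user dlpxinst and the database MYDB are hypothetical names), the required authorities would be granted on the primary database before the backup is taken:
db2 connect to MYDB
db2 grant secadm on database to user dlpxinst
db2 grant dbadm on database to user dlpxinst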
Log transmitting
All changes that take place at the primary database server are written into log files. The individual log records within the log files are then transmitted to the secondary database server, where the recorded changes are replayed to the local copy of the database. This procedure ensures that the primary and the secondary database servers are in a synchronized state. Using two dedicated TCP/IP communication ports and a heartbeat, the primary and the standby databases track where they are processing currently, the current state of replication, and whether the standby database is up-to-date with the status of the primary database. When a log record is "closed" (still in memory, but has yet to be written to disk on the primary), it is immediately transmitted to the HADR standby database(s). Transmission of the logs to the standbys may also be time-delayed.
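For illustration only (host names and port numbers are hypothetical), the dedicated TCP/IP communication endpoints are defined through HADR database configuration parameters, for example on the primary:
db2 update db cfg for MYDB using HADR_LOCAL_HOST prodhost HADR_LOCAL_SVC 51012 HADR_REMOTE_HOST staginghost HADR_REMOTE_SVC 51013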
Multiple standby
Beginning in Db2 v10.1, the HADR feature supports multiple standby databases. This enables an advanced topology where you can deploy HADR in multiple standby mode with up to three standby databases for a single primary. One of the databases is designated as the principal HADR standby database, with the others termed auxiliary HADR standby databases. As with the standard HADR deployment, both types of HADR standbys are synchronized with the HADR primary database through a direct TCP/IP connection. Furthermore, both types support the reads on standby feature and can be configured for time-delayed log replay. It is possible to issue a forced or unforced takeover on any standby, including the Delphix auxiliary standby. However, you should never use the Delphix auxiliary standby as a primary, because this will impact Delphix performance.
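A sketch with hypothetical host names and ports: in multiple standby mode, the primary lists its standbys in HADR_TARGET_LIST, and the first entry is treated as the principal standby:
db2 update db cfg for MYDB using HADR_TARGET_LIST "principalhost:51014|auxhost1:51015|delphixhost:51016"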
Delphix HADR synchronization
Delphix for Db2 uses the HADR capability of Db2 to synchronize data from a production Db2 database into a Delphix-controlled Db2 "standby" server. By using this mature, existing Db2 capability, the Delphix Engine is able to ingest data and keep the standby server in sync with only a minimal impact on production. The HADR connection is configured in Super-Asynchronous (SUPERASYNC) mode, where log writes are considered successfully transmitted as soon as the log records are sent from the primary database. Because the primary database does not wait for acknowledgments from the standby database, there is no delay on the primary, and transactions are considered committed regardless of the state of replication of that transaction. For further information, see the Data synchronization section.
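For example (the database name is hypothetical), the synchronization mode is set in the database configuration and HADR is then started on the Delphix-controlled standby:
db2 update db cfg for MYDB using HADR_SYNCMODE SUPERASYNC
db2 start hadr on database MYDB as standby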
SSL/TLS support for HADR
Db2 version 11.1 introduced the HADR_SSL_LABEL configuration parameter, which lets you establish SSL encryption for your Db2 HADR databases, significantly bolstering security and data protection during critical operations like replication and synchronization.
With the inclusion of HADR_SSL_LABEL support in the IBM Db2 connector, you gain a streamlined method to integrate SSL security within your Db2 HADR setups. This means that all data replication and communication between primary and standby databases can benefit from robust encryption in transit, effectively securing sensitive information from potential threats.
Utilizing this feature within the IBM Db2 connector ensures that your data remains encrypted while in transit between the primary and standby databases, reinforcing the integrity and confidentiality of your Db2 environment.
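A minimal sketch, assuming a keystore and a certificate label named mydbserver have already been configured for the instance (the paths and label are hypothetical); the same label is then referenced by HADR:
db2 update dbm cfg using SSL_SVR_KEYDB /home/db2inst1/keystore/server.kdb SSL_SVR_STASH /home/db2inst1/keystore/server.sth
db2 update db cfg for MYDB using HADR_SSL_LABEL mydbserver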
Database partitioning feature
Database Partitioning Feature (DPF) lets you partition your database across multiple servers. Because you can add new machines and spread your database across them, users can scale their database: more CPUs, more memory, and more disks from each additional machine become available to the database. DPF can be used to manage large databases for a variety of use cases, including data warehousing, data mining, online analytical processing (OLAP), and online transaction processing (OLTP) workloads.
DPF enables the user to divide a database into database partitions. A database partition is a part of a database that consists of its own data, indexes, configuration files, and transaction logs. Each database partition can be configured on a different physical server with its own set of computing resources, including CPU and storage. When a query is processed, the request is divided so that each database partition processes the rows it is responsible for. DPF can maintain consistent query performance as a table grows by providing the capability to add more processing power in the form of additional database partitions. This capability is often referred to as providing linear scalability using Db2's shared-nothing architecture.
DPF is an approach to sizing and configuring an entire database system. Follow the recommended practices for optimal performance, reliability, and capacity growth, and refer to the IBM Knowledge Center documentation on DPF for more details.
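For illustration (host names are hypothetical), database partitions are defined in the instance's db2nodes.cfg file, one line per partition with its partition number, host name, and logical port:
0 dpfhost1 0
1 dpfhost1 1
2 dpfhost2 0
3 dpfhost2 1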
Backup types supported in DPF
The types of backups supported by the IBM Db2 connector when used with DPF are:
- SSV backups through backup files or named pipes.
- Non-SSV backups through backup files or named pipes.
Example backup command for Db2 SSV backups: db2_all db2 BACKUP DATABASE <DB_NAME> ON all dbpartitionnums online to <Backup File Location> compress include logs.
Example backup command for Db2 non-SSV backups: db2_all db2 backup database <DB_NAME> online to <Backup File Location> compress include logs.
Source database with Raw DEVICE type storage
Several users use raw device-based tablespaces for source Db2 databases. To leverage these environments with Delphix, Delphix has built a workflow using Db2’s native tools that allow Delphix to discover and convert a raw device-based tablespace into an automatic storage-based tablespace during ingestion. Once the data is ingested into staging, end users will be able to provision VDBs of the automatic storage-based database.
Support for named pipe type Db2 backup
Users can also provide Db2 backups of type named pipe. The IBM Db2 connector will check the backup type (either a named pipe or a file) and use the corresponding restore syntax.
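A sketch of producing a named-pipe backup on the source (the names and paths are hypothetical, and a separate reader process must consume the pipe while the backup runs):
mkfifo /backups/MYDB.pipe
db2 backup database MYDB online to /backups/MYDB.pipe include logs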
Purging of archive logs after snapshot operation
Logs can be purged from your log location after the snapshot operation. The Connector will preserve the list of logs that are eligible for purging in the snapshot's metadata. The snapshot metadata can be accessed via the Delphix CLI or API.
Pipelining logic for implementing parallel restores
The connector employs a parallel pipeline methodology so that the restore operations of non-catalog partitions can be performed in parallel when the Database Partitioning Feature (DPF) is in use. The number of parallel restores is determined by the value of “restorePipelineLimit” (default value is 10) in <toolkit path>/Delphix_COMMON_<long id>/plugin/DB2_18f4ff11-b758-4bf2-9a37-719a22f5a4b8/db2_plugin.conf; for more details, refer to Requirements for Db2 Hosts and Databases. The “restorePipelineLimit” parameter is configurable by end users. The connector performs parallel restores for all non-catalog partitions. For example, if the total number of non-catalog partitions is 15 and "restorePipelineLimit" is set to 10, the first set of 10 restores will happen in parallel. The connector tracks the restore of each partition in the pipeline; whenever a partition's restore completes, it moves out of the pipeline and a new partition enters it. Thus, the connector ensures that the pipeline always holds the user-configured number of partitions being restored (default of 10).
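An illustrative db2_plugin.conf entry (the exact format and the value shown here are assumptions, not recommendations):
restorePipelineLimit=15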
In case of a restore failure for a particular partition, the connector will track that failure and report it in the connector-generated logs so that the user can take the necessary action. If the user receives failures during the restore operation for a few partitions and then decides to manually initiate the restores for the failed partitions, the connector will verify the data paths during the snapshot operation. If it finds wrong path information (such as paths outside the mount point), the connector will error out of the operation.
The connector tracks the status of each restore in a file named <DB Name>_metadata.json. The connector can handle a scenario where the staging host reboots in between the parallel restores; in this case, the user can perform a "Resynchronization" operation again, and the remaining restores will resume from scratch. For more details, refer to Requirements for Db2 Hosts and Databases.
Data synchronization
Delphix provides the following options to keep your dSource in sync with the source database:
- During the linking process, you can set up a HADR connection between the original source database and the standby instance. This allows the standby instance, using HADR for log shipping, to always keep its databases in sync with the source. It is important to note that a single standby instance (dSource) can contain multiple databases from multiple different servers and instances, as long as each database has a unique name.
- Users can opt to configure SSL with the HADR connection to enhance security by encrypting data in transit between the primary and the standby databases.
- Users can opt for the "Staging Push" feature by selecting the "Use Staging Push" checkbox. With this feature, the user can manage the restore and roll forward of a staging database without involving the Delphix Engine.
- Users can opt for a non-HADR method of database ingestion. The database will be ingested to a standalone staging host, and the user can trigger a resynchronization operation with the most recent online full backup to keep the staging database up to date with the source database.
- Users can opt for the Delphix Recovery with End User Supplied Archive Logs method. During an update, the dSource user can trigger a snapshot operation where user-provided archive logs will get applied to the staging database.
- Optionally, when the DPF feature is in use, users can use the "Archive Logs" feature. Users need to place archive logs inside a folder named NODE<Partition number>, where <Partition number> is a four-digit number. For example, if the source environment has two partitions, the user-provided log path will have folders named NODE0000 and NODE0001, each containing the respective archive logs (see the example layout after this list). The snapshot operation will be used to apply the archive logs.
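An illustrative layout of the user-provided log path for a two-partition source (the directory root and log file names are hypothetical):
<user log path>/NODE0000/S0000010.LOG
<user log path>/NODE0000/S0000011.LOG
<user log path>/NODE0001/S0000008.LOG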
Delphix HADR standby configuration may not meet Db2 DR requirements and should only be used for Delphix use cases.
Banners
Banners are any kind of messages that are printed from commands or scripts that are executed on a new session. They are mainly used to print welcome messages or information regarding the host when you start a new session. There are four ways to accomplish this:
- Print statements in .profile.
- /etc/motd.
- SSH banner configured under /etc/ssh/sshd_config.
- Print statements in .bashrc.
Adding print statements or commands that return strings in .bashrc can hurt the execution of the connector's workflows and also basic Unix commands such as scp or rsync. Therefore, it is extremely important to avoid using print statements in the .bashrc file.
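As an illustration of the SSH banner approach (the banner file path is hypothetical), the message is configured in /etc/ssh/sshd_config so it is displayed at login without touching shell startup files:
Banner /etc/ssh/delphix_banner.txt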