SAP HANA smart data integration and other HANA provisioning tools at a glance

(Last Updated On: 28 June 2017)

There is now a variety of integration technologies for getting data into your SAP HANA instance. The technical term for the transfer of data into an SAP HANA instance is "SAP HANA data provisioning". To enumerate a small pool of available technologies: SAP Data Services (DS), the SAP HANA Direct Extractor Connection (DXC), smart data access (SDA), the SAP LT Replication Server (SLT), the SAP Replication Server (SRS, formerly Sybase), SAP HANA system replication, the SAP HANA cloud connector, and many more. In this post we want to devote ourselves explicitly to SAP HANA smart data integration (SDI); however, the purpose of this technology can only be put into context if we first deal with SAP's current data replication portfolio.

What types of replication technologies are there?

When replicating data into an SAP HANA database, the available replication technologies can be categorized as follows:

  • Trigger-based replication is used, for example, by the SAP LT Replication Server. The replication technology installs triggers at the database level of the source system, which fire when certain SQL statements (INSERT, UPDATE, DELETE) are executed. In the case of the LT Replication Server, these triggers signal that something has changed in a database table that is to be replicated. The LT Replication Server then checks so-called logging tables in the source system for the changes and "replays" only the delta from the source system against the current database state in the target system. Trigger-based technology thus minimizes the amount of data to be transferred before transport, which is one of its main advantages. This makes this type of data replication excellently suited for near-real-time replication, the almost instantaneous transfer of data to a target system.
  • ETL-based replication is used if the data from the source system must be converted into a destination format before being transferred to the target system (your SAP HANA instance), for example because the number of attributes or table columns changes in the destination table, or the data type of a column. Even if, for example, the LT Replication Server is able to perform simple transformations on table content before transfer, you should rather use dedicated technologies proposed for this purpose. Often this type of replication is used to load data of specific business periods from multiple systems into a HANA instance for analysis purposes.
  • Extractor-based data transfer is the pretty straightforward extraction of data and the subsequent import of the data models into SAP HANA. Technologies proposed by SAP, such as the DXC, already extract the data in a destination format matching SAP HANA. The manual transformation effort is reduced to a minimum.
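To make the trigger-based mechanism concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for the source and target databases (the table, trigger, and column names are invented for illustration): triggers fill a logging table on every change, and a replication cycle reads only that delta.

```python
import sqlite3

src = sqlite3.connect(":memory:")  # source system
tgt = sqlite3.connect(":memory:")  # target system (stand-in for HANA)

src.executescript("""
CREATE TABLE orders (id INTEGER PRIMARY KEY, amount INTEGER);
-- logging table: records only the keys that changed (the delta)
CREATE TABLE orders_log (id INTEGER, op TEXT);
CREATE TRIGGER trg_ins AFTER INSERT ON orders
  BEGIN INSERT INTO orders_log VALUES (NEW.id, 'I'); END;
CREATE TRIGGER trg_upd AFTER UPDATE ON orders
  BEGIN INSERT INTO orders_log VALUES (NEW.id, 'U'); END;
""")
tgt.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount INTEGER)")

# changes happen in the source system; the triggers fill the logging table
src.execute("INSERT INTO orders VALUES (1, 100)")
src.execute("INSERT INTO orders VALUES (2, 200)")
src.execute("UPDATE orders SET amount = 150 WHERE id = 1")
src.commit()

# replication cycle: transfer only the delta recorded in the logging table,
# not the whole table
changed_ids = {row[0] for row in src.execute("SELECT id FROM orders_log")}
for oid in sorted(changed_ids):
    row = src.execute("SELECT id, amount FROM orders WHERE id = ?", (oid,)).fetchone()
    tgt.execute("INSERT OR REPLACE INTO orders VALUES (?, ?)", row)
src.execute("DELETE FROM orders_log")  # the delta has been consumed
tgt.commit()

print(tgt.execute("SELECT id, amount FROM orders ORDER BY id").fetchall())
# → [(1, 150), (2, 200)]
```

Note that the replicator never scans the full source table for differences; the logging table alone tells it what to transfer, which is exactly the data-volume advantage described above.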

When do I use which transmission technology for SAP HANA?

If you specifically ask when you should use which integration or replication technology for SAP HANA, this chapter may serve as a short guide. Each integration technology has a particular strength, and thus an optimal scenario in which it is used. Specifically, you use:

  • the SAP LT Replication Server for trigger-based real-time replication of table content. Trigger-based replication reduces the volume of data to be transferred from the outset. But be careful: the replication of the LT Replication Server is neither object-oriented nor transaction-oriented. That means if, for a "purchase order", you must replicate the contents of four different tables from the source system, it can happen that the contents of two tables have arrived in the target system at point X, but the contents of the other two tables have not yet. The LT Replication Server is not transaction-oriented; it transfers only the contents of one table in each replication cycle. You therefore have no guarantee of object-oriented consistency in the target system. Either live with this uncertainty in the target system or use an object-oriented replication for this kind of data. SAP is currently working on implementing object-oriented replication for the LT Replication Server, but only for SAP objects. For non-SAP objects you must at present resort to a different technology or put your own application logic in between.
  • the HANA cloud connector for the one-time copying of batches of database tables into a HANA instance in the cloud. Of course, it is also possible to transfer data into a HANA cloud instance in real time, for example via the LT Replication Server. The HANA cloud connector is merely an alternative for batch transfer of data at low cost (low TCO). So if you need your data not in real time, but only intermittently in your target HANA, the HANA cloud connector is the technology of your choice.
  • the SAP Replication Server, in contrast to the SAP LT Replication Server, if you need transactionally consistent data replication. The SAP Replication Server, unlike the SAP LT Replication Server, makes sure that changes to a database are replicated exactly as they were changed within a database transaction. So if, within one database transaction (whatever happens in SQL up to a "COMMIT"), changes are written to, say, 5 tables connected with each other by foreign-key relationships, the SAP Replication Server updates all of these 5 tables at the same time. This means: here, in contrast to the LT Replication Server, your "purchase order" is always consistent at point X in the target system. The disadvantage is that the SAP Replication Server must transfer each transaction individually for this purpose; ergo, the SAP Replication Server must shovel much more data around than the LT Replication Server, which, no matter how many transactions happened in between, replicates only the current state of a table. Another advantage of transaction-oriented data replication is that you can restore the target database with exactly the same fine-grained point-in-time recovery as the source database. That is, when a data record in the source database was accidentally overwritten with incorrect information and you need to recover the old state of this record, you just run a point-in-time recovery in the target database to the last consistent state, and then export the data into the source database to restore the desired state. With a database replicated by the LT Replication Server, that would not work (or only with luck).
  • SAP HANA system replication if you need transactional replication, but both source and target system are SAP HANA instances. While with the SAP Replication Server the source system is usually an AnyDB, SAP HANA system replication requires an SAP HANA database as the source system as well. In contrast to the SAP Replication Server, SAP HANA system replication has the advantage that the HANA-to-HANA transfer is optimized for you, and that in addition to the mere data it can also replicate, for example, the system configuration, HANA database users, and other HANA objects such as calculation views. If you want to learn more about SAP HANA system replication, read my post on SAP HANA tailored data center integration (TDI).
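The consistency difference between table-by-table and transaction-oriented replication can be sketched in a few lines of Python. This is a toy model, not real replication server code; the table names and the "purchase order" content are invented. The point is that a transaction-oriented replicator ships a whole commit as one unit, so the target never shows a half-written object.

```python
# Toy model: one "purchase order" spans two tables (header and items).
# A transaction-oriented replicator ships each commit as a single unit.

source_commit_log = [
    # each entry = one database transaction, with all its table changes
    {"order_header": [("PO-1", "open")],
     "order_items":  [("PO-1", "item-A"), ("PO-1", "item-B")]},
]

target = {"order_header": [], "order_items": []}

def apply_transaction(target_db, txn):
    """Apply all table changes of one source transaction together."""
    for table, rows in txn.items():
        target_db[table].extend(rows)

for txn in source_commit_log:
    apply_transaction(target, txn)  # header and items arrive together

print(target["order_header"])  # → [('PO-1', 'open')]
print(target["order_items"])   # → [('PO-1', 'item-A'), ('PO-1', 'item-B')]
```

A table-by-table replicator, by contrast, would loop over the tables in separate cycles, so an observer reading the target between cycles could see the header without its items.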

What are typical use cases for the different SAP HANA data provisioning technologies?

The SAP LT replication server

The SAP LT Replication Server is, as we have learned, neither an object-oriented nor a transactional data provisioning technology. Because transactional consistency must be ensured in financial accounting systems, for example, the SAP LT Replication Server is not suitable for the live transmission of financial data between two locations if transactional consistency in source and target is required at all times. Instead, the SAP LT Replication Server is used for scenarios in which it can shine. The "LT" in SAP LT Replication Server stands for landscape transformation. The main usage scenario of SLT is therefore the consolidation of multiple SAP application systems of the same type into a single system, in order to tighten up the system landscape. Just as often, the SAP LT Replication Server is used to merge several company codes. For this, the LT Replication Server transforms fields of the tables of one company code into the other and thereby replicates the data under the new company code in the target system. Since the SAP LT Replication Server provides no transactional consistency, a downtime should be planned in both systems for this. That is not so bad, though, because this kind of system consolidation is mostly done in the context of a migration or an upgrade to a new release anyway, so a downtime window was firmly planned from the outset. Furthermore, the SAP LT Replication Server is optimally suited for consolidating data from multiple sources into one target, to make it available for analytical purposes or to provide a search based on data from multiple systems, because here no transactional consistency is usually assumed. Instead, exactly those objects that have been replicated by time X are analyzed or displayed in the search results. The user consciously accepts that objects not yet completely replicated at time X cannot be displayed or included in the analysis.
Often, for example, data lakes are built with SLT to make SAP data available to analysis software such as QlikView, Lumira, and Tableau, or to search engines.

When and how does SAP HANA smart data integration (SDI) come into play?

SAP HANA smart data integration is a huge buzzword in the SAP world. At trade fairs, people talk themselves hoarse about it. So what is behind this prodigy? SAP HANA smart data integration aims to combine the advantages of the different tools above in HANA-optimized usage scenarios. Simply put: the data provisioning technologies presented above will continue to have their raison d'ĂȘtre in scenarios in which they particularly shine (such as merging company codes with the SAP LT Replication Server, or ensuring the high availability of an SAP HANA instance through SAP HANA system replication). But with smart data integration, customers should be able to cover as many data replication scenarios for SAP HANA as possible with only a single tool. Smart data integration is thus meant to be the "Black & Decker tool kit" for SAP HANA data provisioning, which however does not mean that the professional can do without buying specialized wrenches. But at least, with SAP HANA smart data integration (SDI) alone, it is possible to perform data federation using smart data access (SDA), complex data transformations, and real-time data replication. SDI is a software component of each HANA database and therefore does not need to be downloaded separately.

Data Federation with SAP HANA smart data integration (SDI)

What is also new with SAP HANA smart data integration (SDI) is the improved support for data federation processes. It is possible to connect local on-premise source databases as well as Hadoop storage to the HANA database via SDI. You can then address the data via virtual tables as if it were stored directly in the SAP HANA database, although the data in fact continues to exist in its actual source systems; it is neither copied to SAP HANA nor loaded into the memory of the HANA instance. SDI is therefore also a kind of enabler for the integration of data sources such as near-line storage (NLS). But data federation over virtual tables has other use cases as well. With smart data integration it is, for example, easier than ever to read data from other data sources such as Web 2.0, social media, and news pages, to transform it, and to persist it in the HANA database. It is becoming increasingly important to access data from the social networks of customers, suppliers, and partners, and to unite it with the information stored in connection with master data management (MDM), for example. And the analysis of markets and of the company's own image is becoming ever easier through the analysis of press pages. Accessing external sources with the help of smart data integration is indeed the flagship scenario for SAP HANA as a big data appliance, and SAP HANA smart data integration (SDI) is an essential enabler of SAP HANA in that role. What makes smart data integration the tool of choice here? The so-called real-time feature. If you previously followed, for example, the RSS feed of a page, you had to program logic that checked the feed regularly for changes and re-fetched its entries. With smart data integration you just set the "Realtime" checkbox and can thus follow the changes of the feed in real time. This sampling should not be done by the SAP HANA instance itself, but by an SAP HANA data provisioning agent running on a separate system.
The provisioning agent is placed outside of the internal network and forwards the information to the SAP HANA database in the internal network, which corresponds to the architecture of a demilitarized zone (DMZ) from a security point of view. The technical basis for SDI is the smart data access technology, which has more in common with it than just the similar name. It is the same technical basis with which the HANA database represents remote databases such as Sybase ASE, Sybase IQ, and the TREX. The contents of the real-time subscriptions, for example the XML of an RSS feed, are scanned and transformed into a virtual table. This means the XML tags in the RSS feed become columns of a virtual table, and to this virtual table I can then apply ETL processes in HANA and, for example, persist the transformed information in HANA, in the near-line storage, or in the Hadoop store.
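The idea of XML tags from an RSS feed becoming columns of a table that SQL can then work on can be illustrated with Python's standard library. This is a simplified stand-in for what the data provisioning agent and the virtual-table layer do; the feed content is made up.

```python
import sqlite3
import xml.etree.ElementTree as ET

# made-up RSS snippet, standing in for a real feed fetched by the agent
rss = """<rss><channel>
  <item><title>HANA SPS news</title><pubDate>2017-06-28</pubDate></item>
  <item><title>SDI deep dive</title><pubDate>2017-06-29</pubDate></item>
</channel></rss>"""

db = sqlite3.connect(":memory:")
# the XML tags of an <item> become the columns of the table
db.execute("CREATE TABLE rss_feed (title TEXT, pubDate TEXT)")

for item in ET.fromstring(rss).iter("item"):
    db.execute("INSERT INTO rss_feed VALUES (?, ?)",
               (item.findtext("title"), item.findtext("pubDate")))

# ordinary SQL (and hence ETL logic) can now be applied to the feed data
rows = db.execute(
    "SELECT title FROM rss_feed WHERE pubDate >= '2017-06-29'").fetchall()
print(rows)  # → [('SDI deep dive',)]
```

In the real SDI setup, the agent performs the fetching and the subscription keeps the virtual table current; here the one-off parse merely shows the tag-to-column mapping.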

Data replication with SAP HANA smart data integration (SDI)

As we learned earlier, data federation with SAP HANA is the basis for making data from other sources available to the SAP HANA database via virtual tables. With data replication, however, we replicate data from source databases such as Oracle 12, DB2, Microsoft SQL Server, Teradata, etc. in real time into HANA. This data then really resides on the storage of the HANA instance and therefore has the potential to be loaded into main memory and kept there. Not only databases, but also Twitter as a social media data source can be replicated. Changes in those data sources are replicated in near-real time into the SAP HANA database. File sources such as Excel spreadsheets, flat files, or an OData service can also be used as sources. SAP HANA smart data integration (SDI) has the potential to steal the SAP LT Replication Server's thunder in one scenario: replicating data from different sources (Hadoop, SAP IQ, Excel sheets, flat files) in near-real time into an SAP HANA database. This is because SDI offers the advantage of data flows, in which much more complex data transformations (see below) can be applied to the source data than with SAP LT before they are persisted in the destination format in the SAP HANA database. Nevertheless, SDI does not replace the LT Replication Server in, for example, an ABAP-to-ABAP scenario in which two SAP systems use RFC connections to trigger a replication.
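As a small illustration of a flat file acting as a replication source with an in-flight transformation, here is a hedged Python sketch (the file contents, table, and column names are invented; sqlite3 again stands in for the HANA target):

```python
import csv
import io
import sqlite3

# a flat file as replication source (contents invented for illustration)
flat_file = io.StringIO("id,region,revenue\n1,EMEA,1000\n2,APJ,2000\n")

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE sales (id INTEGER, region TEXT, revenue INTEGER)")

for rec in csv.DictReader(flat_file):
    # a simple in-flight transformation, as an SDI data flow might apply one
    db.execute("INSERT INTO sales VALUES (?, ?, ?)",
               (int(rec["id"]), rec["region"].lower(), int(rec["revenue"])))

print(db.execute("SELECT region, revenue FROM sales ORDER BY id").fetchall())
# → [('emea', 1000), ('apj', 2000)]
```

Unlike the federation case, the rows now physically live in the target database and could be kept in memory there.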

Data transformation with SAP HANA smart data integration (SDI)

With the data transformation capabilities it is possible to transform data into the target format before it is referenced through the aforementioned virtual tables or persisted directly in HANA. These transformations can take the shape of simple SQL operations such as joins, WHERE filters, or aggregate functions, or the use of pivots (converting rows to columns or vice versa) or CASE transformations.
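Two of the transformations named above, CASE and pivot, can be combined in one statement: conditional aggregation turns rows into columns. A minimal sketch with invented sample data (sqlite3 as a stand-in; the SQL pattern itself is generic):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE sales (year INTEGER, quarter TEXT, revenue INTEGER);
INSERT INTO sales VALUES
  (2016,'Q1',10),(2016,'Q2',20),(2017,'Q1',30),(2017,'Q2',40);
""")

# CASE + aggregation pivots rows into columns:
# one output row per year, one column per quarter
pivot = db.execute("""
    SELECT year,
           SUM(CASE WHEN quarter = 'Q1' THEN revenue ELSE 0 END) AS q1,
           SUM(CASE WHEN quarter = 'Q2' THEN revenue ELSE 0 END) AS q2
    FROM sales
    GROUP BY year
    ORDER BY year
""").fetchall()
print(pivot)  # → [(2016, 10, 20), (2017, 30, 40)]
```

In an SDI data flow such a step would sit between the source and the target table, so the data already lands in HANA in its pivoted destination format.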

2 Responses

  1. Al says:

    Hi, thank you for this excellent post. I have a question about connecting source data from an on-premise database (e.g. Oracle) to SAP HANA hosted in the SAP Cloud Platform. Should the SAP Data Provisioning Agent be deployed in the DMZ of the on-premise network instead of on a separate host in the internal network zone where the database is located?

    • DaFRK says:


      From a performance perspective, you should deploy the data provisioning agent as close to the HANA database as possible, meaning in the same network segment. But for security reasons, the moment you are reaching out to data sources located on the web / in the cloud, you should separate the network segments of the data provisioning agent and the SAP HANA database by a firewall and put the agent in the inner DMZ. This ensures a defensive layer between the exposed provisioning agent and your HANA database with all its sensitive data.

      Also make sure to separate the network segments for frontend access to the SAP HANA database (HTTPS port) and the backend communication between HANA and the other data provisioning tools (SAP HANA instance port).
