Extraction, Transformation and Loading (ETL)

SAP NetWeaver BI offers flexible means for integrating data from various sources. Depending on the data warehousing strategy for your application scenario, you can either extract the data from the source and load it into the SAP NetWeaver BI system, or access the data directly in the source without storing it physically in the Enterprise Data Warehouse. In the latter case, the data is integrated virtually in the Enterprise Data Warehouse.

Sources for the Enterprise Data Warehouse can be operational, relational datasets (for example in SAP systems), files or older systems. Multidimensional sources, such as data from other BI systems, are also possible.

Transformations permit you to perform a technical cleanup and to consolidate the data from a business point of view.

Extraction and Loading

Extraction and transfer processes in the initial layer of SAP NetWeaver BI, as well as direct access to data, are possible using various interfaces, depending on the origin and format of the data. In this way, SAP NetWeaver BI allows the integration of relational and multidimensional data as well as of SAP and non-SAP data.

● BI Service API (BI Service Application Programming Interface)
The BI Service API permits the extraction of and direct access to data from SAP systems in standardized form. These can be SAP application systems or SAP NetWeaver BI systems. The data request is controlled from the SAP NetWeaver BI system.
● File Interface
The file interface permits the extraction from and direct access to files, such as CSV files. The data request is controlled from the SAP NetWeaver BI system.
● Web Services
Web services permit you to send data to the SAP NetWeaver BI system under external control.
● UD Connect (Universal Data Connect)
UD Connect permits the extraction from and direct access to both relational and multidimensional data. The data request is controlled from the SAP NetWeaver BI system.
● DB Connect (Database Connect)
DB Connect permits the extraction from and direct access to data stored in tables or views of a database management system. The data request is controlled from the SAP NetWeaver BI system.
● Staging BAPIs (Staging Business Application Programming Interfaces)
Staging BAPIs are open interfaces through which third-party tools can extract data from older systems. The data transfer can be triggered by a request from the SAP NetWeaver BI system or by a third-party tool.

Transformation

With transformations, data loaded into the SAP NetWeaver BI system through the interfaces listed above is converted from a source format to a target format in the data warehouse layers. The transformation permits you to consolidate, clean up and integrate the data, and thus to synchronize it technically and semantically so that it can be evaluated. This is done using rules that permit any degree of complexity when transforming the data. The functionality includes a 1:1 assignment of the data, the use of complex functions in formulas, and the custom programming of transformation rules.

For example, you can define formulas that use the functions of the transformation library. The following are offered for defining formulas: basic functions (such as and, if, less than, greater than), functions for character strings (such as displaying values in uppercase), date functions (such as computing the quarter from the date) and mathematical functions (such as division and exponential functions).

Availability Requirements for Data in SAP NetWeaver BI

Depending on the business problem, the data might need to be more or less up-to-date. For example, if you want to check the sales strategy for a product group each month, you need the sales data for this time span. Historic, aggregated data is taken into consideration.
The scheduler is an SAP NetWeaver BI tool that loads the data at regular intervals, for example every night, using a job that is scheduled in the background. In this way no additional load is put on the operational system. We recommend that you use standard data acquisition, that is, scheduled regular data transfers, to support your strategic decision-making.

Tactical decision-making usually relies on data that is quite up-to-date and granular, for example when you analyze error rates in production in order to configure the production machines optimally. The data can be staged in the SAP NetWeaver BI system based on its availability and loaded at intervals of minutes. A permanently active job of SAP background processing is used here; this job is controlled by a special process, a daemon. This procedure of data staging is called real-time data acquisition.

Because the data is loaded into a data warehouse, the performance of the source system is not affected during data analysis. The load processes, however, require administrative overhead.

If you need data that is very up-to-date, and either the users only need to access a small dataset sporadically or only a few users run queries on the dataset at the same time, you can read the data directly from the source during analysis and reporting. In this case the data is not archived in the SAP NetWeaver BI system; data staging is virtual. You use the VirtualProvider here. This procedure is called direct access.
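
To tie the extraction, transformation and loading steps described in this article together, here is a minimal sketch in plain Python. It is not SAP code and does not use any SAP NetWeaver BI interface. The CSV content, the field names and the helpers `quarter_of`, `transform` and `run_etl` are all hypothetical and chosen only for illustration. The sketch mimics a file-based extraction, applies the kinds of transformation rules mentioned above (1:1 assignment, an uppercase string function, a quarter-from-date function, and an "if" combined with a division), and loads the result into an in-memory list that stands in for the target.

```python
import csv
import io
from datetime import date

# Hypothetical source data, standing in for a CSV file read through a file interface.
SOURCE_CSV = """order_id,customer,order_date,revenue,quantity
1001,acme corp,2010-02-15,1200.00,4
1002,globex,2010-11-03,450.50,0
"""


def quarter_of(d):
    """Date function: compute the quarter (1 to 4) from a date."""
    return (d.month - 1) // 3 + 1


def transform(row):
    """Convert one source record (a dict of strings) to the target format."""
    order_date = date.fromisoformat(row["order_date"])
    revenue = float(row["revenue"])
    quantity = int(row["quantity"])
    return {
        "order_id": row["order_id"],            # 1:1 assignment
        "customer": row["customer"].upper(),    # string function: uppercase
        "quarter": quarter_of(order_date),      # date function: quarter from date
        # basic "if" plus a mathematical function (division);
        # the guard avoids dividing by zero
        "unit_price": revenue / quantity if quantity > 0 else None,
    }


def run_etl(source_text):
    """Extract rows from CSV text, transform each one, and return the loaded target."""
    reader = csv.DictReader(io.StringIO(source_text))  # extraction
    return [transform(row) for row in reader]          # transformation and loading


target = run_etl(SOURCE_CSV)
for record in target:
    print(record)
```

In a real system the rules would be maintained in the warehouse's own transformation tooling rather than hand-written like this; the sketch only shows how simple per-field rules turn a source record into a target record.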
Dec 08
ETL Concepts