1) Big Data:
Big Data is a massive volume of both structured and unstructured data that is so large it is difficult to process using traditional database and software techniques. In most enterprise scenarios, the data is too big, moves too fast, or exceeds current processing capacity. Despite these problems, big data has the potential to help companies improve operations and make faster, more intelligent decisions.
Often, Big Data also refers to the technology that organizations deploy to manage their extremely large volumes of data.
2) Data Compression:
Data compression is the storage of data in a format that requires less space than usual. It is particularly useful in communication because it enables devices to transmit or store the same amount of data in fewer bits. Data compression is widely used in backup utilities and database management systems such as SAP HANA.
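As a rough illustration of storing the same data in fewer bits, the sketch below uses Python's standard zlib module on a made-up, highly repetitive log. The sample data and the helper name compression_ratio are assumptions for illustration only; this is not how SAP HANA compresses its column store.

```python
import zlib

def compression_ratio(data: bytes) -> float:
    """Return compressed size divided by original size (smaller is better)."""
    compressed = zlib.compress(data, 9)  # 9 = strongest (slowest) zlib level
    return len(compressed) / len(data)

# Hypothetical sample: the same sales record repeated many times.
original = b"2024-01-01,store-42,SKU-1001,1\n" * 1000
compressed = zlib.compress(original, 9)

# Lossless compression: decompressing gives back exactly the original bytes.
restored = zlib.decompress(compressed)

print(len(original), "bytes before,", len(compressed), "bytes after")
```

Repetitive data like this compresses very well; data that is already random or already compressed may barely shrink at all.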
3) Parallel Processing:
This is the simultaneous use of more than one CPU to execute a program. Ideally, parallel processing makes a program run faster because there are more engines (CPUs) running it. In practice, it is often difficult to divide a program in such a way that separate CPUs can execute different portions without interfering with each other.
Some computers have just one CPU, but many have several CPUs or cores. There are even computers with thousands of CPUs. With single-CPU computers, it is possible to perform parallel processing by connecting the computers in a network. However, this type of parallel processing requires very sophisticated software called distributed processing software.
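The idea of dividing a program into portions that do not interfere with each other can be sketched as follows. This is only a structural sketch: the function names are hypothetical, and a thread pool is used as a stand-in so the example runs anywhere. In CPython, threads do not execute CPU-bound code on several CPUs at once; for that, concurrent.futures.ProcessPoolExecutor offers the same interface and would be swapped in (run under an `if __name__ == "__main__":` guard).

```python
from concurrent.futures import ThreadPoolExecutor

def sum_of_squares(chunk):
    """Work on one independent portion; touches no shared state."""
    return sum(n * n for n in chunk)

def split_into_chunks(items, parts):
    """Split items into at most `parts` contiguous chunks of similar size."""
    size = max(1, -(-len(items) // parts))  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]

def parallel_sum_of_squares(numbers, workers=4):
    """Hand each chunk to a worker, then combine the partial results."""
    chunks = split_into_chunks(numbers, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_sums = list(pool.map(sum_of_squares, chunks))
    return sum(partial_sums)

total = parallel_sum_of_squares(list(range(1, 101)))
print(total)
```

Because each chunk is processed without reading or writing anything the other chunks use, the workers cannot interfere with each other, which is the property the text describes as hard to achieve in general.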
4) OLTP/OLAP:
a) OLTP (online transaction processing): These are operational systems that directly support the company’s day-to-day business tasks. OLTP systems facilitate and manage transaction-oriented applications, typically for data entry and retrieval. The data in an OLTP system is current, dynamic and constantly updated. It can be read, written and updated.
b) OLAP (online analytical processing): These are systems that provide key information to the management of a company. This information is needed to make appropriate business decisions based on data available inside or outside the company. The information system that can process this type of data is commonly called a data warehouse. Data required for decision processes is derived from various sources and aggregated. The data is extracted, transformed and loaded (the ETL process) from an operational system such as SAP ERP into an analytical one for more complex analyses. In an OLAP system, data is summarized, historical and de-normalized, with few tables in a star schema. Queries are usually complex and ad hoc.
Differences between OLTP and OLAP:

| | OLTP | OLAP |
|---|---|---|
| Data | Detailed, up to date | Summarized, historical |
| Database Design | Highly normalized with many tables | De-normalized, few tables in star schema |
| Query | Simple and standardized | Complex and ad hoc |
| Database Size | Megabytes to gigabytes | Gigabytes to terabytes |
| Purpose | For fundamental business activities | For decision support and problem solving |
| Users | IT professionals, front-end users | Decision makers |
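The contrast between the two query styles can be sketched with Python's built-in sqlite3 module. The sales table, its columns and its figures are made up for illustration; a real OLTP system and data warehouse would be separate systems connected by an ETL process, not one in-memory table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (id INTEGER PRIMARY KEY, region TEXT, year INTEGER, amount REAL);
    INSERT INTO sales (region, year, amount) VALUES
        ('North', 2022, 100.0), ('North', 2023, 150.0),
        ('South', 2022, 80.0), ('South', 2023, 120.0);
""")

# OLTP style: short, standardized statements that touch individual rows.
with conn:
    conn.execute("INSERT INTO sales (region, year, amount) VALUES (?, ?, ?)",
                 ("South", 2023, 30.0))
    conn.execute("UPDATE sales SET amount = ? WHERE id = ?", (110.0, 1))

# OLAP style: a read-only query that summarizes many rows for analysis.
summary = conn.execute("""
    SELECT region, year, SUM(amount)
    FROM sales
    GROUP BY region, year
    ORDER BY region, year
""").fetchall()

for row in summary:
    print(row)
```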
5) In-memory computing:
In-memory computing is the storage of information in the main random access memory (RAM) of dedicated servers rather than in complicated relational databases operating on comparatively slow disk drives. In-memory computing helps business customers, including retailers, banks and utilities, to quickly detect patterns, analyze massive data volumes on the fly, and perform their operations quickly.
SAP HANA uses sophisticated data compression to keep data in random access memory (RAM), where access is reported to be up to 10,000 times faster than on standard disks. This allows companies to analyze data in a matter of seconds.
Advantages of in-memory computing include:
- The ability to cache large amounts of data constantly, which ensures extremely fast response times for searches.
- The ability to store session data, allowing for the customization of live sessions and ensuring optimum website performance.
- The ability to process events for improved complex event processing.
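The caching advantage can be sketched with a plain Python dictionary held in RAM in front of a slower lookup. The class name CachedLookup and the stand-in slow_lookup function are hypothetical; this illustrates the general idea only and says nothing about HANA's internal design.

```python
class CachedLookup:
    """Keep results in an in-memory dict so repeated requests skip the slow source."""

    def __init__(self, slow_lookup):
        self._slow_lookup = slow_lookup  # e.g. a disk read or remote call
        self._cache = {}
        self.misses = 0                  # how often the slow source was used

    def get(self, key):
        if key not in self._cache:
            self.misses += 1
            self._cache[key] = self._slow_lookup(key)
        return self._cache[key]

calls = []

def slow_lookup(key):
    """Stand-in for a slow, disk-based lookup; records each call."""
    calls.append(key)
    return key.upper()

cache = CachedLookup(slow_lookup)
first = cache.get("customer-1")   # goes to the slow source
second = cache.get("customer-1")  # answered from memory
```

Only the first request reaches the slow source; the second is served directly from RAM.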
6) HANA Architecture:
7) What is XS Engine?
It is a lightweight application server embedded in SAP HANA that can be used to layer procedural logic and Web services on top of HANA tables and views. Applications that run on the SAP XS Engine can be written by developers in HANA Studio, and they can also be generated by SAP’s River Rapid Development Environment.
XS Engine allows for the creation of applications on HANA using only HANA and a front-end application library like SAPUI5. No separate application server is necessary.
Primarily, XS Engine works on the level of incoming HTTP requests and outgoing responses. A request comes in to a given XS Engine service (which lives at a URL on the HANA system), and the service has complete control over evaluating the request and building the response it wants to send back.
That said, XS Engine is clearly designed to support the development of lightweight services and applications in HANA. Developers planning heavier-weight applications should consider using a standalone application server running alongside their HANA system.
8) What is MDX Query?
The Multidimensional Expressions (MDX) language allows users to describe queries and manipulate multidimensional information, such as the data stored in cubes. MDX functions can define calculated members and query cube data.
The basic Multidimensional Expressions (MDX) query is the SELECT statement—the most frequently used query in MDX. By understanding how an MDX SELECT statement must specify a result set, what the syntax of the SELECT statement is, and how to create a simple query using the SELECT statement, you will have a solid understanding of how to use MDX to query multidimensional data.
In MDX, the SELECT statement specifies a result set that contains a subset of multidimensional data that has been returned from a cube. To specify a result set, an MDX query must contain the following information:
- The number of axes that you want the result set to contain. You can specify up to 128 axes in an MDX query.
- The set of members or tuples to include on each axis of the MDX query.
- The name of the cube that sets the context of the MDX query.
- The set of members or tuples to include on the slicer axis.
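The four pieces of information listed above map onto the clauses of an MDX SELECT statement: one set per axis, the cube in the FROM clause, and the slicer in the WHERE clause. The Python sketch below only assembles the query text; the helper build_mdx_select and the cube, measure and member names are hypothetical, and the resulting string would still have to be sent to an MDX-capable server.

```python
AXIS_NAMES = ["COLUMNS", "ROWS", "PAGES", "SECTIONS", "CHAPTERS"]

def build_mdx_select(axes, cube, slicer=None):
    """Assemble MDX SELECT text from axis sets, a cube name and an optional slicer."""
    if len(axes) > 128:
        raise ValueError("an MDX query can specify at most 128 axes")
    parts = []
    for index, members in enumerate(axes):
        # The first five axes have aliases; later ones are addressed by number.
        axis = AXIS_NAMES[index] if index < len(AXIS_NAMES) else f"AXIS({index})"
        parts.append("{ " + ", ".join(members) + " } ON " + axis)
    query = "SELECT " + ", ".join(parts) + " FROM [" + cube + "]"
    if slicer:
        query += " WHERE ( " + ", ".join(slicer) + " )"
    return query

query = build_mdx_select(
    axes=[["[Measures].[Sales Amount]"], ["[Date].[2023]", "[Date].[2024]"]],
    cube="Sales",
    slicer=["[Region].[North]"],
)
print(query)
```

Here the measure goes on the COLUMNS axis, two date members go on the ROWS axis, the hypothetical Sales cube sets the context, and the region member restricts the result through the slicer axis.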
9) What is HANA Studio?
HANA Studio is an Eclipse-based, integrated development environment (IDE) that is used to develop artifacts in a HANA server.
It enables technical users to manage the SAP HANA database, to create and manage user authorizations, and to create or modify data models.
It is a client tool that can be used to access local or remote HANA systems.
The SAP HANA studio runs on the Eclipse platform 3.6. It can be used on the following platforms:
- Microsoft Windows, 32-bit (x86) and 64-bit (x64) versions of Windows XP, Windows Vista and Windows 7
- SUSE Linux Enterprise Server (SLES) 11: x86 64-bit version
10) What types of SQL joins are available?
An SQL JOIN clause is used to combine rows from two or more tables, based on a common field between them.
The available SQL joins are:
Inner Join: An inner join produces a result set that is limited to the rows where there is a match in both tables for what we’re looking for.
Left Outer: A left outer join, or left join, results in a set where all of the rows from the first, or left hand side, table are preserved. The rows from the second, or right hand side, table only show up if they have a match with the rows from the first table. Where a row from the left table has no match in the right table, the right table's columns in the result are null, which means that the value has not been set.
Right Outer: A right outer join, or right join, is the same as a left join, except the roles are reversed. All of the rows from the right hand side table show up in the result, but the rows from the table on the left are only there if they match the table on the right. Empty spaces are null, just like with the left join.
Full Outer: A full outer join, or just outer join, produces a result set with all of the rows of both tables, regardless of whether there are any matches. Similarly to the left and right joins, we call the empty spaces null.
Cross: The cross join returns a table with a potentially very large number of rows. The row count of the result is equal to the number of rows in the first table times the number of rows in the second table. Each row is a combination of the rows of the first and second table.
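The join types can be tried out with Python's built-in sqlite3 module. The customers and orders tables below are made up for illustration. Right and full outer joins are left out of the sketch because older SQLite versions do not support them; in databases that do, they follow the same pattern with RIGHT OUTER JOIN and FULL OUTER JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Ben'), (3, 'Cy');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 40.0), (12, 2, 15.0);
""")

def run(sql):
    return conn.execute(sql).fetchall()

# Inner join: only customers that have at least one matching order.
inner_rows = run("""
    SELECT c.name, o.amount
    FROM customers c INNER JOIN orders o ON o.customer_id = c.id
    ORDER BY o.id
""")

# Left outer join: every customer; amount is NULL (None) without an order.
left_rows = run("""
    SELECT c.name, o.amount
    FROM customers c LEFT OUTER JOIN orders o ON o.customer_id = c.id
    ORDER BY c.id, o.id
""")

# Cross join: every customer combined with every order (3 x 3 rows).
cross_rows = run("SELECT c.name, o.id FROM customers c CROSS JOIN orders o")

print(inner_rows)
print(left_rows)
print(len(cross_rows))
```

Customer Cy has no orders, so Cy is missing from the inner join but appears in the left join with a null amount.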
11) What is Predictive Analytics?
Predictive analytics is the practice of extracting information from existing data sets in order to determine patterns and predict future outcomes and trends. Predictive analytics does not tell you what will happen in the future. It forecasts what might happen in the future with an acceptable level of reliability, and includes what-if scenarios and risk assessment.
When applied to business, predictive models are used to analyze current data and historical facts in order to better understand customers, products and partners and to identify potential risks and opportunities for a company. It uses a number of techniques, including data mining, statistical modeling and machine learning to help analysts make future business forecasts.
Predictive analytics is an enabler of big data: businesses collect vast amounts of real-time customer data, and predictive analytics uses this data, combined with customer insight, to predict future events. Predictive analytics enables organizations to use big data (both stored and real-time) to move from a historical view to a forward-looking perspective of the customer.
For example, stores that use data from loyalty programs can analyze past buying behavior to predict the coupons or promotions a customer is most likely to respond to in the future. Predictive analytics could also be applied to customer website browsing behavior to deliver a personalized website experience for the customer.
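As a very small example of the statistical modeling mentioned above, the sketch below fits a straight line to past values with ordinary least squares and extrapolates one step ahead. The monthly sales figures and the function names are made up for illustration; real predictive models are usually far richer and are validated against held-out data.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def forecast(xs, ys, next_x):
    """Predict the value at next_x from the fitted trend line."""
    slope, intercept = fit_line(xs, ys)
    return slope * next_x + intercept

# Hypothetical history: sales (in thousands) for months 1 to 4.
months = [1, 2, 3, 4]
sales = [10.0, 12.0, 14.0, 16.0]

predicted = forecast(months, sales, 5)
print(predicted)
```

The forecast is only what might happen if the past trend continues, which matches the caveat that predictive analytics does not tell you what will happen.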
12) What is Session Management in HANA?
This component is responsible for creating and managing sessions and connections for the database clients.
Once a session is established, clients can communicate with the SAP HANA database using SQL statements.
For each session, a set of parameters is maintained, such as auto-commit and the current transaction isolation level.
Users are authenticated either by the SAP HANA database itself (login with user and password), or authentication can be delegated to an external authentication provider such as an LDAP directory.
13) What is Persistence Management in HANA?
The persistence service makes available both in-memory and relational database storage to applications that are hosted on SAP HANA Cloud Platform. As well as managing the databases and providing the means to access them, it also performs other tasks such as backup and recovery (24×7), load balancing, and scaling.
The database persistence layer is responsible for durability and atomicity of transactions. It ensures that the database can be restored to the most recent committed state after a restart and that transactions are either completely executed or completely undone.
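The "either completely executed or completely undone" property (atomicity) can be demonstrated with Python's built-in sqlite3 module. The accounts table and the transfer function are hypothetical, and an in-memory SQLite database stands in for a real database, so this sketch shows rollback behavior only, not durability across restarts or anything specific to HANA.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE accounts (
        name TEXT PRIMARY KEY,
        balance INTEGER NOT NULL CHECK (balance >= 0)
    )
""")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one transaction; return True on success."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute("UPDATE accounts SET balance = balance + ? WHERE name = ?",
                         (amount, dst))
            conn.execute("UPDATE accounts SET balance = balance - ? WHERE name = ?",
                         (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False

def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts ORDER BY name"))

ok = transfer(conn, "A", "B", 30)       # both updates are applied
failed = transfer(conn, "A", "B", 500)  # debit violates the CHECK; credit is undone
print(ok, failed, balances(conn))
```

In the failed transfer, the credit to B has already been executed when the debit from A is rejected, and the rollback removes it again, so no half-finished transaction is left behind.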
14) How many engines are available in HANA DB?
There are five engines available in HANA DB, namely:
- Join Engine
- OLAP Engine
- Calculation Engine
- ROW Engine
- SQL Engine