How are OLAP OLTP data warehouses analytics analysis and data mining related

0 votes

I'm trying to understand what OLAP, OLTP, data mining, analytics etc. are about, and I feel like my understanding about some of these concepts is still a bit vague. Information about these subjects tend to be explained in a very complex manner on the internet.


I feel like a question like this is likely to be closed since it's a very broad one, so I'll try to narrow it down into two questions:

Question 1:

After doing research I understand the following about these concepts, is it correct?

  • Analysis is decomposing something complex, to understand the inner workings better.
  • Analytics is predictive analysis on information that requires alot of math and statistics.
  • There's many type of databases, but they are either OLTP (transactional) or OLAP (analytical).
  • OLTP databases use ER diagrams, and are therefore easier to update because they are in normalized form.
  • In contrast, OLAP uses the denormalized star schema's and is therefore easier to query
  • OLAP is used for predictive analysis and OLTP is usually used in more practical situations since theres no redundancy.
  • Data warehouses is a type of OLAP database, and usually consists out of multiple other databases.
  • Data mining is a tool used in analytics, where u use computer software to find out relationships between data so you can predict things (e.g. customer behavior).

Question 2:

I'm especially confused about the difference between analytics and analysis. They say analytics is multidimensional analysis, but what is that supposed to mean?

Apr 4, 2022 in Machine Learning by Dev
• 6,000 points
496 views

1 answer to this question.

0 votes

From the top of the pyramid, I'll try to explain:

Business intelligence (which you didn't mention) is an IT phrase that refers to a complicated system that extracts usable information from data to provide relevant information about a company.

As a result, the goal of BI systems is to provide clean, accurate, and meaningful data. There are no technical issues when the house is clean (missing keys, incomplete data etc). BI systems are also utilized as a database fault checker in production environments (logical faults - i.e.; invoice bill is too high, or inactive partner is used etc). It was accomplished through the use of rules. Meaningfulness is difficult to define, but in simple terms, it's all of your data (even the excel table from the last meeting) organized in the way you wish.

So, the back-end of a BI system is a data warehouse. A database is all that DWH is (instance, not software). It can be kept in a relational database management system (RDBMS), an analytical database (columnar or document store kinds), or a NoSQL database.

The phrase "data warehouse" is commonly used to refer to the entire database described above. There may be a number of data-marts (if the Kimball model is employed) - or a relational system in its third normalized form (Inmon model) known as an enterprise data warehouse.

Data marts are tables in DWH that are related to one another (star schema, snowflake schema). Fact tables (denormalized business processes) and dimension tables

One business process is represented by each data mart. DWH, for example, has three data marts. Retail sales are one, export is another, and import is the third. Total sales, quantity sold, import price, profit (measures) by SKU, date, shop, city, and so on are all available in retail (dimensions).

The process of loading data into a DWH is known as ETL (extract, transform, load). Data should be extracted from multiple sources (ERP db, CRM db, excel files, web service...) Data transformation (clean data, connect data from diff sources, match keys, mine data) Data should be loaded (Load transformed data in specific data marts) edit as a result of a comment: ETL processes are typically built using an ETL tool or by hand using a programming language (python, c#, etc.) and APIs.

The ETL process is a collection of SQLs, processes, scripts, and rules that are linked and divided into three stages (see diagram above), all of which are governed by meta data. It's either pre-recorded (every night, every few hours) or live (every few minutes) (change data capture, triggers, transactions).

Data processing methods include OLTP and OLAP. OLTP stands for "online transaction processing" and is used to transfer data between a database and software (there is usually just one route to input and output data). OLAP stands for "online analytical processing," which indicates there are various sources, historical data, fast select query performance, and data that has been mined.

edit as a result of a comment: Data processing is the process of storing and retrieving data from a database. As a result, the database is set up differently depending on your demands.

Credits for the imagehttp://datawarehouse4u.info/:

enter image description here
The computer technique of detecting patterns in big data sets is known as data mining. Data mining can provide you with a more detailed perspective of a business operation or even a forecast.

In the area of business intelligence, analysis is a verb that refers to the ease with which you can extract the information you need from data. Multidimensional analysis describes how your data is sliced by the system (with dimensions inside the cube). According to Wikipedia, data analysis is the process of evaluating data in order to identify usable information.
Analytics is a term that refers to the outcome of an analysis process.
Don't make such a big deal out of just two words.

Elevate Your Data Analyst Skills with the Best Course. - Join Our Leading Data Analyst Course!

answered Apr 4, 2022 by Nandini
• 5,480 points

Related Questions In Machine Learning

0 votes
1 answer

If both negative and positive skewness are present in data set,then how it can be removed??

Hi@shama, It depends on your use case. If ...READ MORE

answered Dec 8, 2020 in Machine Learning by MD
• 95,440 points
546 views
0 votes
1 answer
0 votes
1 answer

DBSCAN algorithm and clustering algorithm for data mining

You can use any distance function with ...READ MORE

answered Mar 4, 2022 in Machine Learning by Dev
• 6,000 points
837 views
0 votes
1 answer

Orange Data Mining and Logistic Regression

A statistical classification method that fits data ...READ MORE

answered Mar 15, 2022 in Machine Learning by Dev
• 6,000 points
1,148 views
0 votes
1 answer

Is it possible to access Couchbase by terminal?

What you want to accomplish isn't possible ...READ MORE

answered Apr 29, 2018 in DevOps & Agile by DareDev
• 6,890 points
480 views
0 votes
1 answer

How to connect Java program to the MySQL database?

You can connect your Java code with ...READ MORE

answered May 11, 2018 in Java by Parth
• 4,630 points
1,610 views
0 votes
1 answer
+1 vote
3 answers

How do I check and export sonarqube analysis reports in embedded db?

Browse to sonarqube web instance to access ...READ MORE

answered Jul 2, 2018 in DevOps Tools by DareDev
• 6,890 points
27,466 views
0 votes
1 answer

What is the difference between linear regression and logistic regression?

Hi Dev, to answer your question Linear Regression ...READ MORE

answered Feb 2, 2022 in Machine Learning by Nandini
• 5,480 points
913 views
0 votes
1 answer

Assumptions of Naïve Bayes and Logistic Regression

There are very few difference between Naive ...READ MORE

answered Feb 7, 2022 in Machine Learning by Nandini
• 5,480 points
297 views
webinar REGISTER FOR FREE WEBINAR X
REGISTER NOW
webinar_success Thank you for registering Join Edureka Meetup community for 100+ Free Webinars each month JOIN MEETUP GROUP