Data modelling is the process of structuring and organizing data so that it can be analyzed effectively. This may involve creating models to represent the data, as well as cleansing and transforming the data to make it more consistent and easier to work with. Data modelling is often used alongside other methods, such as statistical analysis, machine learning, and natural language processing, to extract insights from data.
Data modelling is a critical part of any text analytics project, as it can help ensure that the data is organized in a way that will make analysis easier and more accurate. By taking the time to carefully structure and cleanse data before beginning any analysis, text analytics practitioners can save time and avoid potential errors down the line.
Data modelling is related to other methods used for data analysis, such as data mining, predictive modelling, and business intelligence. However, data modelling focuses on how the data is structured (its entities, their attributes, and the relationships between them), rather than on identifying patterns in the data or making predictions from it.
Key steps in Data Modelling:
- identifying entities and attributes
- defining relationships between entities
- designing a database to store the data
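The first two steps above can be sketched in a few lines of Python before any database is involved. This is only an illustration: the `Author` and `Document` entities, their attributes, and the sample records are hypothetical and were chosen to fit a text analytics setting.

```python
from dataclasses import dataclass

# Hypothetical entities, for illustration only.
@dataclass
class Author:
    author_id: int   # attribute that identifies the entity
    name: str        # descriptive attribute

@dataclass
class Document:
    document_id: int
    title: str
    author_id: int   # relationship: each document points at one author

authors = {1: Author(1, "Ada"), 2: Author(2, "Lin")}
documents = [
    Document(10, "Intro", 1),
    Document(11, "Notes", 1),
    Document(12, "Survey", 2),
]

def titles_by(author_id):
    """Follow the author-to-documents relationship."""
    return [d.title for d in documents if d.author_id == author_id]

print(titles_by(1))
```

Writing the entities down this way makes it easier to see which attributes and relationships a database design would later have to support.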
Types of Data Models:
1. Relational Model. The relational model is a database model based on first-order predicate logic. It was proposed by Edgar F. Codd in 1970, while he was working at IBM.
The relational model organizes data into tables, which correspond to relations in the mathematical sense: each table is a set of rows that all share the same columns. Each table, or relation, has a set of columns, or attributes, which define the data that can be stored in that table. Tables can also be related to one another through foreign keys, which are attributes whose values match the primary key of another table.
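As a minimal sketch of tables, primary keys, and foreign keys, the following uses Python's built-in `sqlite3` module with an in-memory database. The `author` and `document` tables and their rows are hypothetical. Note that SQLite only enforces foreign keys once `PRAGMA foreign_keys = ON` has been issued.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default

# Hypothetical tables: each document row references one author row.
conn.execute("CREATE TABLE author (author_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("""
    CREATE TABLE document (
        document_id INTEGER PRIMARY KEY,
        title       TEXT NOT NULL,
        author_id   INTEGER NOT NULL REFERENCES author(author_id)
    )
""")
conn.execute("INSERT INTO author VALUES (1, 'Ada')")
conn.execute("INSERT INTO document VALUES (10, 'Intro', 1)")

# Relate the two tables by matching the foreign key to the primary key.
rows = conn.execute("""
    SELECT d.title, a.name
    FROM document AS d
    JOIN author AS a ON a.author_id = d.author_id
""").fetchall()

# A document that points at a non-existent author is rejected.
try:
    conn.execute("INSERT INTO document VALUES (11, 'Orphan', 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

Here `rows` holds the joined title and author name, and `orphan_rejected` records that the database refused a row whose foreign key had no matching primary key.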
2. Non-relational or NoSQL Model. Non-relational, or NoSQL, databases are a newer family of databases that has become increasingly popular. Unlike relational databases, which store data in tables with a fixed set of columns, non-relational databases store data in more flexible formats, such as key-value pairs, documents, wide columns, or graphs, which can make the data easier to scale and to change over time.
NoSQL databases are often used for big data applications where the data is too complex or too large to be effectively stored in a relational database. They are also often used for applications that require real-time data access, such as online gaming and social media.
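To illustrate the flexible format, here is a sketch of a document-style collection using plain Python dictionaries. It does not use any real NoSQL product; the `posts` collection, its fields, and the `find` helper are all hypothetical.

```python
import json

# Hypothetical "posts" collection: each document is self-contained,
# and documents do not have to share the same fields.
posts = [
    {"_id": 1, "author": {"name": "Ada"}, "text": "Hello", "tags": ["intro"]},
    {"_id": 2, "author": {"name": "Lin"}, "text": "Score!", "game": {"level": 3}},
]

def find(collection, **criteria):
    """Return documents whose top-level fields equal the given values."""
    return [
        doc for doc in collection
        if all(doc.get(key) == value for key, value in criteria.items())
    ]

# Fields that are missing from a document are simply absent, not NULL columns.
tagged_intro = [doc["_id"] for doc in posts if "intro" in doc.get("tags", [])]

# Documents map directly onto JSON, a common storage and exchange format.
serialized = json.dumps(posts[1])
```

The second post carries a nested `game` field that the first one lacks, which would require a schema change or an extra table in a relational design.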
3. Object-oriented Model. The object-oriented model is a type of database model that is based on the idea of objects. Objects are data items that contain both attributes (data) and methods (functions).
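A small sketch of an object that bundles attributes with methods, written as ordinary Python classes. The `Document` and `Email` classes are hypothetical, and an object-oriented database would additionally persist such objects, which this example does not attempt.

```python
class Document:
    """An object with attributes (data) and methods (functions)."""

    def __init__(self, title, text):
        self.title = title   # attribute
        self.text = text     # attribute

    def word_count(self):    # method operating on the object's own data
        return len(self.text.split())


class Email(Document):
    """A more specific kind of document that inherits from Document."""

    def __init__(self, title, text, sender):
        super().__init__(title, text)
        self.sender = sender


message = Email("Hi", "see you soon", "ada@example.com")
print(message.word_count())
```

Inheritance lets `Email` reuse `word_count` without redefining it, which is one reason this model appeals when the data naturally forms families of related types.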
4. Dimensional Model. The dimensional model is a type of database model, widely used in data warehousing, that organizes data into facts and dimensions. Facts are numeric measurements, such as a sales amount. Dimensions are the descriptive attributes that give those measurements context. For example, a dimension might be “time”, “product”, or “customer”.
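In data warehousing practice, a dimensional model is commonly laid out as a star schema: one fact table of measurements surrounded by dimension tables that describe them. The sketch below assumes that reading. The `fact_sales`, `dim_product`, and `dim_date` tables and their rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension tables describe the context of each measurement.
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
    CREATE TABLE dim_date    (date_id    INTEGER PRIMARY KEY, year INTEGER);

    -- The fact table holds the measurements plus keys into each dimension.
    CREATE TABLE fact_sales (
        product_id INTEGER REFERENCES dim_product(product_id),
        date_id    INTEGER REFERENCES dim_date(date_id),
        amount     REAL
    );

    INSERT INTO dim_product VALUES (1, 'books'), (2, 'games');
    INSERT INTO dim_date    VALUES (1, 2023), (2, 2024);
    INSERT INTO fact_sales  VALUES (1, 1, 10.0), (1, 2, 15.0), (2, 2, 40.0);
""")

# Summarize the fact table by attributes taken from the dimensions.
totals = conn.execute("""
    SELECT p.category, d.year, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    JOIN dim_date    AS d ON d.date_id    = f.date_id
    GROUP BY p.category, d.year
    ORDER BY p.category, d.year
""").fetchall()
```

Queries against such a schema typically follow this shape: join the fact table to the dimensions of interest, then group by dimension attributes.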
5. Hierarchical Model. The hierarchical model is a type of database model that arranges records in a tree-like structure, in which each child record has exactly one parent. For example, a hierarchy might capture a “parent-child” or an “employee-manager” relationship.
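A tree-like structure can be sketched by storing, for every record, a reference to its single parent. The employee names below are hypothetical, and the example assumes each employee has at most one manager.

```python
# Each record stores its one parent; None marks the root of the tree.
manager_of = {"Ada": None, "Lin": "Ada", "Sam": "Ada", "Kim": "Lin"}

def chain_of_command(employee):
    """Walk from an employee up to the root of the hierarchy."""
    chain = []
    while employee is not None:
        chain.append(employee)
        employee = manager_of[employee]
    return chain

def direct_reports(manager):
    """Find the children of a node by scanning for records that name it as parent."""
    return sorted(e for e, m in manager_of.items() if m == manager)

print(chain_of_command("Kim"))
print(direct_reports("Ada"))
```

Walking upward is cheap because every record has one parent, while finding children requires a scan unless child links are stored as well.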
6. Network Model. The network model is a type of database model in which records are connected to one another through links. Unlike the hierarchical model, it allows a record to have more than one parent, so the records form a graph rather than a strict tree. It suits data that is itself network-shaped, such as a computer network, a telephone network, or a road network.
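The classic network database model lets a record be linked to several owning records, which a tree cannot express. The sketch below assumes that reading and stores links in both directions. The supplier and part names are hypothetical.

```python
from collections import defaultdict

# Hypothetical links between owner records (suppliers) and member records (parts).
links = [
    ("SupplierA", "Bolt"),
    ("SupplierB", "Bolt"),   # "Bolt" has two owners, which a tree would not allow
    ("SupplierA", "Nut"),
]

members = defaultdict(set)   # owner  -> linked members
owners = defaultdict(set)    # member -> linked owners

for owner, member in links:
    members[owner].add(member)
    owners[member].add(owner)

print(sorted(owners["Bolt"]))
print(sorted(members["SupplierA"]))
```

Keeping both maps means the links can be followed in either direction without scanning every record.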
7. Multi-dimensional Model. The multi-dimensional model is a type of database model that represents data as a cube: each cell holds a measure, and each axis of the cube is a dimension. It underlies online analytical processing (OLAP). For example, sales figures might be analyzed along “product”, “region”, and “time” dimensions.
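In analytical systems, a multi-dimensional model is usually pictured as a cube of measures indexed by several dimensions. The sketch below assumes that reading and uses a plain dictionary keyed by tuples. The dimension names, the sales figures, and the `roll_up` and `slice_cube` helpers are hypothetical.

```python
from collections import defaultdict

DIMENSIONS = ("product", "region", "year")

# Each cell of the cube: (product, region, year) -> sales measure.
cube = {
    ("books", "EU", 2023): 10,
    ("books", "US", 2023): 20,
    ("games", "EU", 2023): 5,
    ("books", "EU", 2024): 7,
}

def roll_up(cube, keep):
    """Sum the measure over every dimension that is not listed in `keep`."""
    positions = [DIMENSIONS.index(name) for name in keep]
    totals = defaultdict(int)
    for key, value in cube.items():
        totals[tuple(key[i] for i in positions)] += value
    return dict(totals)

def slice_cube(cube, dimension, value):
    """Keep only the cells where one dimension has a fixed value."""
    position = DIMENSIONS.index(dimension)
    return {key: v for key, v in cube.items() if key[position] == value}

by_product = roll_up(cube, ["product"])
eu_only = slice_cube(cube, "region", "EU")
```

Rolling up and slicing are the two basic operations here: the first collapses dimensions by aggregation, the second fixes one dimension to a single value.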
8. Semantic Model. The semantic model is a type of database model that is based on semantics, that is, the meaning of the data. It describes the real-world concepts that the data represents and how those concepts relate to one another. Because it describes meaning rather than storage, it is not tied to tables in the way the relational model is.
9. Physical Data Model. The physical data model describes how the data is actually stored in a specific database system, including its tables, columns, data types, and indexes.
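As a sketch of what the physical level adds, the example below fixes concrete column types and an index for SQLite, then reads them back from the database catalog. The `document` table and the index name are hypothetical, and other database systems would use different types and storage options.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Physical decisions: concrete column types, constraints, and an index.
    CREATE TABLE document (
        document_id INTEGER PRIMARY KEY,
        title       TEXT NOT NULL,
        created_at  TEXT NOT NULL
    );
    CREATE INDEX idx_document_created_at ON document (created_at);
""")

# PRAGMA table_info returns (cid, name, type, notnull, dflt_value, pk) per column.
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(document)")]

# sqlite_master is SQLite's catalog of tables and indexes.
indexes = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'index' AND tbl_name = 'document'"
)]
```

The index does not change what data can be stored; it only affects how quickly rows can be found by `created_at`, which is typical of physical-level choices.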
Benefits of Data Modelling:
- makes the data easier to understand
- makes future changes easier to plan
- helps with performance tuning
- can be used to generate database code
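The last benefit can be sketched as a small generator that turns a model description into `CREATE TABLE` statements. The `model` dictionary format and the `to_ddl` function are hypothetical and far simpler than what real modelling tools produce. The generated SQL is run against an in-memory SQLite database to check that it is valid.

```python
import sqlite3

# Hypothetical model description: table name -> list of (column, type, constraints).
model = {
    "author": [
        ("author_id", "INTEGER", "PRIMARY KEY"),
        ("name", "TEXT", "NOT NULL"),
    ],
    "document": [
        ("document_id", "INTEGER", "PRIMARY KEY"),
        ("title", "TEXT", "NOT NULL"),
        ("author_id", "INTEGER", "REFERENCES author(author_id)"),
    ],
}

def to_ddl(model):
    """Render one CREATE TABLE statement per table in the model."""
    statements = []
    for table, columns in model.items():
        lines = ",\n".join(
            f"    {name} {sql_type} {constraints}".rstrip()
            for name, sql_type, constraints in columns
        )
        statements.append(f"CREATE TABLE {table} (\n{lines}\n);")
    return "\n".join(statements)

ddl = to_ddl(model)

# Check that the generated code is accepted by a database.
conn = sqlite3.connect(":memory:")
conn.executescript(ddl)
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
)]
```

Keeping the model as data and generating the SQL from it means the model stays the single place where structural changes are made.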
Disadvantages of Data Modelling:
- can be time-consuming
- requires a good understanding of the data
- the model may be difficult to change once a database has been built on it