Raw vs. Processed Data

In data science, raw data is the original data that is collected from a source. It is unprocessed and may contain errors, missing values, and inconsistencies. Processed data is raw data that has been cleaned, organized, and transformed into a format that is more useful for analysis. Here are some of the key differences between raw data and processed data:

  • Raw data is unprocessed. It is exactly what was collected from the source, with no cleaning, organizing, or transformation applied. Processed data has passed through one or more such steps to make it more useful for analysis.
  • Raw data may contain errors. Because nothing has been checked or corrected, it can include typos, incorrect values, or missing values. Processed data has been cleaned to remove or correct the errors that the cleaning step can detect.
  • Raw data may not be organized. Its layout often reflects how it was collected rather than how it will be analyzed, which can make it hard to access and query. Processed data has been arranged so that it is easy to use.
  • Raw data may not be in a format that is compatible with analysis tools. Processed data has been transformed into a format that the tools you want to use can read.
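To make these differences concrete, here is a minimal Python sketch of turning raw records into processed ones. The records, the field names, and the helper functions `parse_age` and `clean` are all hypothetical and chosen only for illustration. Real cleaning rules depend on the data and on the question being asked.

```python
# Hypothetical raw records, as they might arrive from a form or a CSV export.
# Every value is a string, and the rows contain stray spaces, mixed casing,
# a missing age, an invalid age, and one exact duplicate.
RAW_ROWS = [
    {"name": " Alice ", "age": "34", "city": "new york"},
    {"name": "BOB", "age": "", "city": "Boston"},
    {"name": "carol", "age": "abc", "city": "boston "},
    {"name": " Alice ", "age": "34", "city": "new york"},
]


def parse_age(text):
    """Return the age as an int, or None when it is missing or not a number."""
    try:
        return int(text.strip())
    except ValueError:
        return None


def clean(rows):
    """Normalize text fields, convert ages to integers, and drop duplicates."""
    seen = set()
    processed = []
    for row in rows:
        record = (
            row["name"].strip().title(),
            parse_age(row["age"]),
            row["city"].strip().title(),
        )
        if record in seen:
            continue  # skip rows that are identical after normalization
        seen.add(record)
        processed.append(
            {"name": record[0], "age": record[1], "city": record[2]}
        )
    return processed


PROCESSED_ROWS = clean(RAW_ROWS)
for row in PROCESSED_ROWS:
    print(row)
```

In this sketch the processed rows have consistent casing, numeric ages, and no duplicates. Ages that could not be parsed are kept as `None` rather than guessed.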

In general, processed data is more useful for analysis than raw data. However, there are some cases where raw data is necessary. For example, if you are trying to identify patterns that have not been seen before, you may need the raw data, because cleaning can discard the unusual values or fields in which such patterns show up.
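As a small illustration of why raw data can matter when looking for new patterns, the sketch below applies an assumed cleaning rule that drops out-of-range sensor readings. The readings, the 0 to 50 range, and the function name `remove_out_of_range` are made up for this example.

```python
# Hypothetical temperature readings in degrees Celsius.
raw_readings = [21.4, 21.9, 22.1, 85.0, 22.3, 84.7, 22.0]


def remove_out_of_range(values, low=0.0, high=50.0):
    """Assumed cleaning rule: treat anything outside low..high as an error."""
    return [v for v in values if low <= v <= high]


processed_readings = remove_out_of_range(raw_readings)

# The values that cleaning discarded can only be recovered from the raw data.
discarded = [v for v in raw_readings if not 0.0 <= v <= 50.0]

print(processed_readings)
print(discarded)
```

If the two high readings turn out to be a real event rather than a sensor fault, the processed list no longer shows it. Only the raw readings do.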

Benefits of using processed data

  • It is generally more accurate. Processed data has been cleaned to remove known errors, so it is usually more accurate than raw data.
  • It is more organized. Processed data has been organized in a way that makes it easy to access and analyze.
  • It is more compatible with analysis tools. Processed data has been transformed into a format that is compatible with the analysis tools that you want to use.

Drawbacks of using processed data

  • It may not be as complete as raw data. Processing can drop records or fields, so processed data may not contain all of the information that was originally collected.
  • It may not be as timely as raw data. Cleaning and organizing take time, so processed data may lag behind the source.
  • It may not be as flexible as raw data. Choices made during processing are built into the result, so processed data may be difficult to change or adapt to new needs.
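The completeness and flexibility drawbacks can be sketched with a simple aggregation. The transactions and the helper `total_by` below are hypothetical. The point is only that a summary keeps less information than the rows it was built from.

```python
from collections import defaultdict

# Hypothetical raw transactions: (date, customer, amount).
raw_transactions = [
    ("2024-01-01", "alice", 20.0),
    ("2024-01-01", "bob", 35.0),
    ("2024-01-02", "alice", 15.0),
]


def total_by(rows, key_index):
    """Sum the amount column, grouped by the column at key_index."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key_index]] += row[2]
    return dict(totals)


# Processed data: one total per day. The customer column is no longer present.
daily_totals = total_by(raw_transactions, key_index=0)

# A new question (spend per customer) has to go back to the raw rows,
# because daily_totals does not record who made each purchase.
customer_totals = total_by(raw_transactions, key_index=1)

print(daily_totals)
print(customer_totals)
```

Keeping the raw rows alongside the daily totals is one way to stay able to answer questions that were not anticipated when the data was processed.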
