data.

what is data.

In general, data is any set of characters that is gathered and translated for some purpose, usually for analysis.

There are multiple types of data. Data can be a single character, a boolean (true or false), text (a string), a number (integer or floating-point), a picture, a sound, or a video.

Data is processed by the CPU, which uses logical operations to produce new data (output) from source data (input).

data vs information.

The terms 'data' and 'information' are often used as if they meant the same thing. However, they are not the same.

Data is a collection of values from which information can be ascertained. Those values can be characters, numbers, or any other data type. If those values are not processed, they have little meaning to a human.

Information is data that was processed so a human can read, understand, and use it. Information informs you of something. It answers a specific question. It represents a specific truth or fact.

For example, consider the question: 'What is the temperature outside?'

Data provides the basis for an answer to that question. 

If the data is '68' and 'Fahrenheit', the answer is: 'Outside, the temperature is 68 degrees Fahrenheit.'

You must know what 'temperature' is, and what 'degrees Fahrenheit' are in order to process the data into information.
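The temperature example can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the Celsius conversion is added to show that processing requires knowing what the unit means.

```python
def describe_temperature(value, unit):
    """Turn two raw values into a sentence a human can read."""
    return f"Outside, the temperature is {value} degrees {unit}."


def fahrenheit_to_celsius(degrees_f):
    """Processing the data further requires knowing what the unit means."""
    return (degrees_f - 32) * 5 / 9


# Raw data: on their own, these two values have little meaning.
raw_value, raw_unit = "68", "Fahrenheit"

# Information: the same values, interpreted in context.
answer = describe_temperature(raw_value, raw_unit)
celsius = fahrenheit_to_celsius(int(raw_value))
```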

Processing data into information is the fundamental purpose of a computer. The 'P' in CPU stands for 'processing,' specifically, data processing.

signal data vs noise data.

Some data is not relevant or informational. This irrelevant data is called noise. 

For example, if you create an audio recording of a piano concert, you might hear people in the audience coughing, or the sound of a ceiling fan. These noises are irrelevant to the purpose of the audio recording, which is to record the sound of the piano.

Information is analogous to a signal. 

In the example above, the relevant data is the sound of the piano. It answers the question, 'What did the piano sound like?' The remaining data (the noise) does not answer that question, so it can be ignored or removed.

signal and data processing.

Signal processing is the separation of noise from a signal. A noisy signal is analyzed, and the noise is reduced or removed, to accentuate the signal or isolate it completely.

Similarly, data processing identifies meaningful data, and separates it from the meaningless data. The meaningful data is then interpreted, combined, modified, connected, and structured into something new called information.
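As a rough illustration of noise reduction, the sketch below applies a median filter to a short list of samples. The median filter is just one common technique chosen for this example, and the sample values are made up. Each sample is replaced by the median of its neighbors, which suppresses an isolated spike while leaving the steady signal mostly intact.

```python
from statistics import median


def median_filter(samples, window=3):
    """Replace each sample with the median of the samples around it."""
    half = window // 2
    filtered = []
    for i in range(len(samples)):
        start = max(0, i - half)
        end = min(len(samples), i + half + 1)
        filtered.append(median(samples[start:end]))
    return filtered


# Hypothetical readings: a steady signal near 10 with one noisy spike (95).
noisy = [10, 10, 11, 95, 10, 11, 10]
clean = median_filter(noisy)
```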

examples of data and information.

The following is an example of raw data, and how that data can be assembled into information.

Raw data is a term used to describe data that is collected and stored, but has not yet been processed. 

For example, many websites collect data about each person who visits them. The data they collect is raw until it is processed and sorted to make it easier for the web designer to understand.

Example of Data

CA, 325, Kotar, Piercy Road, SJ, 4084554310, 95138, Data Recovery

In this example, the original data appears to be a set of random words and numbers, separated by commas.

Example of Information

Kotar Data Recovery

325 Piercy Road

San Jose, CA 95138

(408) 455-4310

In this information, the original data was interpreted, organized, and formatted according to predefined parameters. Now the meaning of the data is clear: it is the contact information for a company called Kotar Data Recovery.
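The "predefined parameters" can be made concrete with a short Python sketch. The field order, the field names, and the city lookup table are assumptions inferred from the example and are not part of any real system.

```python
raw = "CA, 325, Kotar, Piercy Road, SJ, 4084554310, 95138, Data Recovery"

# Assumed meaning of each position in the raw record (hypothetical names).
FIELDS = ["state", "street_number", "name", "street",
          "city_code", "phone", "zip", "business"]

# Hypothetical lookup table that expands city abbreviations.
CITY_NAMES = {"SJ": "San Jose"}


def to_contact(raw_record):
    """Interpret, organize, and format a raw record as contact information."""
    values = [value.strip() for value in raw_record.split(",")]
    rec = dict(zip(FIELDS, values))
    phone = rec["phone"]
    city = CITY_NAMES.get(rec["city_code"], rec["city_code"])
    return "\n".join([
        f"{rec['name']} {rec['business']}",
        f"{rec['street_number']} {rec['street']}",
        f"{city}, {rec['state']} {rec['zip']}",
        f"({phone[:3]}) {phone[3:6]}-{phone[6:]}",
    ])


contact = to_contact(raw)
```

Without the field list and the lookup table, the program has no way to know that 'SJ' is a city or that '4084554310' is a phone number, which is why unprocessed data has little meaning.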

how data is stored.

In a computer's storage, digital data is a sequence of bits. A bit is the smallest unit of data that a computer can process and store.

All data, including text files, photos, videos, and audio files, is stored as binary code: 1s and 0s.

Pure binary data, however, is of little use if you can't read and write in binary. 

The binary sequence 01001011 01001111 01010100 01000001 01010010, for example, is the ASCII encoding of the text KOTAR, with one 8-bit group per letter.
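The conversion between the two forms can be checked with a minimal Python sketch, assuming each 8-bit group is an ASCII character code:

```python
bits = "01001011 01001111 01010100 01000001 01010010"

# Decode: read each 8-bit group as a base-2 number, then map it to a character.
decoded = "".join(chr(int(group, 2)) for group in bits.split())

# Encode: convert each character code back to an 8-digit binary string.
encoded = " ".join(format(ord(char), "08b") for char in "KOTAR")
```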

A bit, short for binary digit, is the smallest unit of data. It can have two states: 0 or 1.

A byte is 8 bits.

Groups of bits are joined into bytes. A set of 8 bits was chosen because this provides 256 total possibilities, which is sufficient for specifying letters, numbers, spaces, punctuation and other extended characters. Keep in mind that some word processing programs include other sorts of formatting data, and therefore the file sizes become greater than just the number of characters in the file.
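The figure of 256 follows from simple arithmetic: each added bit doubles the number of possible values. The sketch below is illustrative only and assumes plain ASCII text, where each character takes exactly one byte.

```python
def distinct_values(n_bits):
    """Number of different values that n_bits can represent."""
    return 2 ** n_bits


one_bit = distinct_values(1)    # two states: 0 or 1
one_byte = distinct_values(8)   # the 256 possibilities of a byte

# For plain ASCII text, the size in bytes equals the number of characters.
text = "KOTAR"
size_in_bytes = len(text.encode("ascii"))
```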

A kilobyte (KB) is 1024 bytes.

This is because computers use the binary system rather than the decimal system. In the decimal system (base 10), the prefix 'kilo' means 1,000, so 1 kilobyte would be 1,000 bytes. In the binary system (base 2), 1 kilobyte is 2 to the 10th power, or 1,024 bytes.

A megabyte (MB) is 1024 KB.

A gigabyte (GB) is 1024 MB.

A terabyte (TB) is 1024 GB.
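The chain of units above can be expressed as repeated division by 1024. The helper below is a hypothetical sketch, not a standard library function, and it uses the binary (1024-based) convention described in this article.

```python
UNITS = ["bytes", "KB", "MB", "GB", "TB"]


def human_size(num_bytes):
    """Express a byte count in the largest unit that keeps the value below 1024."""
    size = float(num_bytes)
    for unit in UNITS:
        # Stop at the first unit that fits, or at TB if the value is very large.
        if size < 1024 or unit == UNITS[-1]:
            return f"{size:.1f} {unit}"
        size /= 1024


one_kb = human_size(1024)
one_and_a_half_kb = human_size(1536)
one_tb = human_size(1024 ** 4)
```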

All storage media, including hard disk drives (HDDs), solid state drives (SSDs), USB flash drives, and SD cards, store data as bytes.

But the total size of a file doesn’t tell the full story of the content it carries. For example, 400-500 pages of text can fit in a megabyte, while the same megabyte can hold only a short audio clip.

Let's see how many pages of text files, emails, Microsoft Word files, PowerPoint slide decks, and images can fit in a gigabyte:

Text files: nearly 678,000 pages per gigabyte.

Emails: more than 100,000 pages.

Microsoft Word files: almost 65,000 pages.

PowerPoint Slide Decks: roughly 17,500 pages.

Images: close to 15,500 pages.

how many pages go into each unit of data.

The following figures should give an approximate idea of how many pages can be stored on each unit of data:

Bit: about 0.00006 pages.

Byte: about 0.0005 pages.

Kilobyte: 0.5 pages; therefore, one full page requires about 2 KB.

Megabyte: 500 pages or 1 thick book.

Gigabyte: 500,000 pages or 1000 thick books.

Terabyte: 500,000,000 pages or 1 million thick books.

The Library of Congress in Washington D.C., for example, is said to be the world's largest library with over 28 million volumes. This means that about 28 TB of storage would be required to save a digital backup of the entire Library of Congress.
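These estimates can be reproduced with a short calculation. The sketch below assumes about 2 KB per page and 500 pages per thick book, as stated above, and uses exact 1024-based units. Its results therefore differ slightly from the rounded figures in the list: 512 pages per megabyte instead of 500, and roughly 26 TB instead of 28 TB for 28 million volumes.

```python
BYTES_PER_PAGE = 2 * 1024   # assumption: one page of text is about 2 KB
PAGES_PER_BOOK = 500        # assumption: one thick book is about 500 pages

UNIT_BYTES = {
    "KB": 1024,
    "MB": 1024 ** 2,
    "GB": 1024 ** 3,
    "TB": 1024 ** 4,
}


def pages_in(unit):
    """Approximate number of text pages that fit in one unit of data."""
    return UNIT_BYTES[unit] / BYTES_PER_PAGE


def terabytes_for_books(n_books):
    """Approximate storage, in terabytes, needed for n_books thick books."""
    total_bytes = n_books * PAGES_PER_BOOK * BYTES_PER_PAGE
    return total_bytes / UNIT_BYTES["TB"]


pages_per_mb = pages_in("MB")
library_tb = terabytes_for_books(28_000_000)
```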

visualization of larger amounts of data.

It can be difficult to visualize the massive amount of data contained in the larger units. Here are a few comparisons, based on stacks of printed pages, to put things into perspective:

Half a gigabyte of pages would match the height of the average giraffe.

1 gigabyte would be nearly as tall as a telephone pole.

2 gigabytes would extend across an entire bowling lane.

10 gigabytes’ worth of paper documents would cover the length of a football field.

100 gigabytes would tower over the Burj Khalifa, the tallest skyscraper in the world.

500 gigabytes would be nearly as tall as Mount Kilimanjaro.

1,000 gigabytes (roughly 1 terabyte) would almost reach the bottom of the Mariana Trench, the deepest point in any ocean.

data volume is getting exponentially bigger.

The world is creating more data with each passing day, and the amount of digital information is growing at an exponential rate.

According to IDC and Statista (2020), the world was projected to hold 74 zettabytes of data by the end of 2021 (one zettabyte is a billion terabytes), and experts predict the world is going to create an even greater volume of data in the near future:

2022 – 94 zettabytes

2023 – 118 zettabytes

2024 – 149 zettabytes

2025 – over 180 zettabytes
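To see how fast these projections grow, the year-over-year change can be computed directly from the figures quoted above. This is a minimal sketch: the 2021 value is the 74 zettabytes mentioned earlier, and the 2025 value is treated as exactly 180 even though the projection says "over 180", so the last growth rate is a lower bound.

```python
# Projected data volume in zettabytes, as quoted in this article.
zettabytes = {2021: 74, 2022: 94, 2023: 118, 2024: 149, 2025: 180}

TB_PER_ZB = 10 ** 9  # one zettabyte is a billion terabytes


def yearly_growth(series):
    """Percentage growth of each year relative to the previous year."""
    years = sorted(series)
    return {
        year: (series[year] / series[prev] - 1) * 100
        for prev, year in zip(years, years[1:])
    }


growth = yearly_growth(zettabytes)
terabytes_2021 = zettabytes[2021] * TB_PER_ZB
```

With these figures, each year adds roughly 20 to 27 percent to the previous year's volume.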

According to Cisco (2020), internet users will make up 66% of the world’s total population in 2023. In 2023, each person will have an average of 1.6 networked mobile devices and connections and 3.6 total devices connected to the Internet.

Every second, enormous amounts of data are being created by emails and Excel spreadsheets, social media, streaming data, the Internet of Things, SMS messaging, cloud-based communications, etc.

With this growth, more issues will appear, including questions of data ownership and data privacy, cybersecurity risks, and an increase in malware.

On top of that, there is the matter of environmental impact. In a day, nearly 2 million tons of CO2 are emitted by the internet and large amounts of trash are produced by tech. 

Considering a modern hard disk drive can easily exceed 1 terabyte of storage space, this can make data recovery quite a demanding task. It is only going to become more challenging as people continually create more data, new data sources emerge, and average storage capacities increase.

Kotar Data Recovery offers fast and efficient data recovery services for all storage media. Their in-house R&D continuously develops and applies new technologies to keep up with emerging data storage trends.
