1.7 Data

Data is based on bits and bytes. One bit is the smallest unit – it is the building block of all pieces of data. The term ‘bit’ is derived from the term ‘binary digit’. A bit is represented by either a 1 (one) or 0 (zero). 1 is ‘on’, 0 is ‘off’.

A byte is a group of eight bits. Computers also handle bits in larger groups of 16, 32 or 64, which are made up of two, four or eight bytes. There are 256 possible combinations of 1s and 0s in a group of eight bits. Each letter of the alphabet, number and other symbol has a unique byte combination. This set of combinations is known as a code. ASCII (American Standard Code for Information Interchange) is one commonly used code.
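As a quick check of the figure of 256, the short Python sketch below lists every possible pattern of eight bits and counts them. The variable name `patterns` is just an illustrative choice.

```python
from itertools import product

# Build every possible pattern of 8 bits, from 00000000 to 11111111.
patterns = ["".join(bits) for bits in product("01", repeat=8)]

print(len(patterns))   # 256, the same as 2 ** 8
print(patterns[0])     # 00000000
print(patterns[-1])    # 11111111
```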

The first 10 letters of the alphabet are displayed below with their ASCII codes. Can you see a pattern? Can you work out what the ASCII code for K is?

  • A 01000001
  • B 01000010
  • C 01000011
  • D 01000100
  • E 01000101
  • F 01000110
  • G 01000111
  • H 01001000
  • I 01001001
  • J 01001010

Interactive 1.1 Deciphering the ASCII alphabet
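The list of letters and codes above can be reproduced with a few lines of Python. This is a minimal sketch; the function name `to_bits` is made up for this example. It uses the built-in `ord` to get the character's code number and `format` to write that number as eight binary digits.

```python
def to_bits(character):
    """Return the 8-bit ASCII pattern for one character as a string."""
    return format(ord(character), "08b")

# Print the first 10 letters of the alphabet with their codes.
for letter in "ABCDEFGHIJ":
    print(letter, to_bits(letter))
```

Once you have worked out the code for K by hand, you could call `to_bits("K")` to check your answer.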

Memory on the computer is divided into bytes. Each character stored uses about one byte of memory. So the word appointment would use about 11 bytes.

  • 1 kilobyte is about 1000 bytes
  • 1 megabyte is about 1000 kilobytes
  • 1 gigabyte is about 1000 megabytes
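The byte count for a word, and the approximate sizes above, can be checked with a small Python sketch. The constant names are illustrative, and the sizes use the rounded value of 1000 given in the list.

```python
word = "appointment"
stored = word.encode("ascii")   # ASCII uses one byte per character
print(len(stored))              # 11

# Approximate sizes, using the rounded figures from the list above.
KILOBYTE = 1000                 # bytes
MEGABYTE = 1000 * KILOBYTE
GIGABYTE = 1000 * MEGABYTE

# Roughly how many copies of the word would fit in one gigabyte.
print(GIGABYTE // len(stored))
```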
  1. Write the following words in bytes (one letter under another as displayed in question 2 below):
    BEAD DIG HIDE FADE DICE
  2. Can you decipher this word:
    01000100
    01000101
    01000001
    01000110
  3. Make up your own words using the letters given above and give them to a friend to decipher.

Programs such as word processors enable us to simply hit a key on the keyboard and the data is stored instantly. Millions of bytes of data can be stored or recalled every second.
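Deciphering can also be done in code. The sketch below, with a made-up function name `from_bits`, turns each 8-bit pattern back into its character. The sample word is a different one from the exercise, so the answer to question 2 is not given away.

```python
def from_bits(lines):
    """Turn a list of 8-bit patterns into the text they spell."""
    return "".join(chr(int(pattern, 2)) for pattern in lines)

# Two patterns taken from the table of letters A to J.
print(from_bits(["01001000", "01001001"]))  # HI
```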

Data in various combinations of 1s and 0s can represent more than just text characters, numbers and symbols (known as alphanumeric data). Images and sound can also be manipulated, stored and communicated by digital systems. The ability of digital devices to do this has revolutionised the entertainment industry. How data is transferred and stored is covered in the next section.

Alphanumeric data does not take up a lot of memory. Sound and images can take up a large amount of storage, depending on the quality required for display or broadcast.

An image is a collection of pixels. A mixture of red, green and blue (RGB) determines the colour of each pixel. Each pixel can be represented by one eight-bit byte, which allows a palette of 256 colours. Twenty-four-bit colour, also called True Colour, is more commonly used now because of advances in data storage and display. Twenty-four bits allow many more colours: 16,777,216 in fact. In a palette-based image, not every possible colour is stored with the file; only the colours that are used are stored, in a palette for the file. The image file then does not need the full colour code for each pixel, as this would need a lot of memory. What is stored for each pixel is a reference to a colour in the palette.
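The colour counts and the idea of a palette can be sketched in Python. Everything here is illustrative: the four-colour palette, the tiny 2 x 3 image and the 100 x 100 size used for the memory comparison are assumptions chosen to keep the example short, not a real image file format.

```python
# Number of different values that 8 bits and 24 bits can hold.
print(2 ** 8)    # 256
print(2 ** 24)   # 16777216

# Hypothetical palette: each entry is a (red, green, blue) colour.
palette = [(0, 0, 0), (255, 0, 0), (0, 255, 0), (0, 0, 255)]

# A tiny 2 x 3 image stored as references into the palette.
pixels = [
    [1, 1, 3],
    [0, 2, 3],
]

# Look up the actual colour of each pixel through the palette.
image = [[palette[index] for index in row] for row in pixels]
print(image[0][2])  # (0, 0, 255)

# Memory comparison for an assumed 100 x 100 image.
width, height = 100, 100
palette_size = 256
indexed_bytes = width * height * 1 + palette_size * 3   # 1 byte per pixel plus the palette
true_colour_bytes = width * height * 3                  # 3 bytes (24 bits) per pixel
print(indexed_bytes, true_colour_bytes)  # 10768 30000
```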

Interactive 1.2 Representing data

In the example below from Codecademy, the shades of each colour are coded. This code is a reference to a palette of colours.

  1. Go to the Codecademy website.
  2. Follow the instructions on screen. You should be able to alter the coding for the name to insert your own, and play around with the colour shades and see the immediate effect on the screen.

Data compression

Data files can be compressed to save space when storing and transmitting. There are two types of compression – lossless and lossy:

  • Lossless compression means the original data can be restored exactly; for example, an image will be as it was originally. The advantage is that a losslessly compressed file can be restored to the original.
  • Lossy compression actually loses data but the data is still recognisable. For example, converting an image from PNG format to JPEG format will reduce the amount of storage required for the image file, but some data will be permanently lost. Even more data will be lost if you convert it to a 16-colour bitmap. Images used on websites are still easy to view and recognise but if you try to enlarge them it becomes very obvious that they have lost a lot of detail.

    Lossy compression deletes the data that is not necessary for the compressed version. The advantage is that the file does not take up as much memory as it would have. Lossless compression is achieved using compression applications such as WinZip to zip up the files. A file compressed losslessly will usually take up less memory than the original but more than the result of using a lossy method.
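The difference between the two types can be seen in a short Python sketch. The lossless part uses Python's standard `zlib` module, which applies the DEFLATE method also found in ZIP files. The lossy part is only an illustration made up for this example (it keeps the top four bits of each value); it is not how JPEG or any real lossy format works.

```python
import zlib

# Repetitive data compresses well.
original = b"AAAAAAAAAABBBBBBBBBBCCCCCCCCCC" * 100

# Lossless: the data shrinks and can be restored exactly.
compressed = zlib.compress(original)
restored = zlib.decompress(compressed)
print(len(original), len(compressed))
print(restored == original)   # True

# Lossy (illustrative only): keep just the top 4 bits of each value.
samples = [200, 201, 203, 198, 57, 58]
reduced = [value >> 4 for value in samples]       # each value now fits in 4 bits
approximate = [value << 4 for value in reduced]   # the best rebuild possible
print(approximate)            # [192, 192, 192, 192, 48, 48]
```

The rebuilt values are close to the originals but not identical, and the discarded detail cannot be recovered.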

An image is made up of pixels of colour. The higher the number of pixels in a square inch of the image, the higher the quality of the image. However, the higher number of pixels means more memory is required to store the image and process it. Fewer pixels mean less memory is required but the image is not quite as clear as it was, although the content of the image is still recognisable.
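The link between pixel count and memory is simple arithmetic, sketched below in Python for an uncompressed image. The function name `image_bytes` and the two image sizes are assumptions chosen for illustration.

```python
def image_bytes(width, height, bits_per_pixel):
    """Memory needed for an uncompressed image, in bytes."""
    return width * height * bits_per_pixel // 8

# The same picture at full size and at half the width and height,
# both in 24-bit colour.
print(image_bytes(1920, 1080, 24))  # 6220800
print(image_bytes(960, 540, 24))    # 1555200
```

Halving both the width and the height leaves a quarter of the pixels, so the image needs a quarter of the memory.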

Interactive 1.3 Image quality

A video is a set of still images called frames. To create the video, the frames are ordered so that each image is a tiny change from the previous one, creating what looks like a moving image. A frame rate of 24 to 30 frames per second is enough for the human eye to interpret the images as continuous motion. However, at this rate the picture would appear to flicker. To avoid flicker, a rate of about 75 flashes per second is required. To accommodate this, each frame is flashed up to four times.

MPEG (a family of standards for lossy compression of video and audio) is probably the best known and most commonly used format for video compression. The software or hardware that compresses and decompresses the video is known as a codec.
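To see why video needs compression, the Python sketch below estimates the size of a short clip with no compression at all. The figures are assumptions for illustration only: a hypothetical 26-second clip at 24 frames per second, with each frame 1920 x 1080 pixels in 24-bit colour.

```python
# Assumed values for a hypothetical clip.
seconds = 26
frames_per_second = 24
width, height = 1920, 1080
bytes_per_pixel = 3            # 24-bit colour

frame_bytes = width * height * bytes_per_pixel
total_frames = seconds * frames_per_second
total_bytes = frame_bytes * total_frames

print(total_frames)   # 624
print(total_bytes)    # 3881779200, close to 4 gigabytes
```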

Video 1.1 Time-lapse of Sydney harbour – an example of an MPEG 4 video file (00:26 – no audio)