6.1 Data Storage

Nov 10, 2019

Chapter Overview

DATA IS STORED IN VARIOUS FORMS BUT THE COMPUTER CONVERTS ALL OF IT TO BINARY DATA (MACHINE LANGUAGE)
  • Compression
    • Technique
      • Lossy Compression
      • Lossless Compression
    • Ways to compress
  • File formats
    • Musical Instrument Digital Interface (MIDI)
    • Sound & Video
    • Digital Image
    • Text and number
  • File size estimation

6.1.1 Compression

Compression Techniques

LOSSY COMPRESSION

  • Unnecessary bits are discarded during compression
  • Reconstruction of the original file is impossible
  • Suitable for audio, video and image files during streaming to increase upload/download speed
  • E.g. MP3, JPEG

LOSSLESS COMPRESSION

  • Does not lose information during compression
  • All bits from the original file can be reconstructed when files are uncompressed
  • Suitable for text/numbers where the data needs to remain the same for it to make sense, or when high quality is required in music tracks
  • E.g. ZIP
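
The lossless property can be checked with a short round trip. This is a minimal sketch, assuming Python's standard zlib module as a stand-in for a ZIP-style compressor; the sample data is made up for illustration.

```python
import zlib

# Hypothetical sample data: repetitive text compresses well.
original = b"data storage " * 200

compressed = zlib.compress(original)
restored = zlib.decompress(compressed)

# Lossless: every byte of the original is recovered after decompression.
print(len(original), len(compressed), restored == original)
```

The compressed size depends on the input; the point is only that the restored bytes equal the original.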

Ways to compress

  • Using compression software (codecs)
    • Compression codecs are designed to remove data without losing quality (where possible)
    • Algorithms work out what data can be removed and reduce file size
  • Splitting large files into smaller files
  • Removing images from files and sending them separately
  • Uploading the file to cloud storage and sharing the folder via email

6.1.2 File Formats

MUSICAL INSTRUMENT DIGITAL INTERFACE (MIDI)

Communications protocol that allows electronic musical instruments to interact with each other

FEATURE

  • Consists of a list of commands that instruct a device on what to do
    E.g. Key ON and OFF, Key pressure, Velocity, Pitches or notes played, Aftertouch
  • Uses 8-bit serial transmission with 1 start bit and 1 stop bit
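
As an illustration of the command list, here is a minimal sketch of a Key ON (Note On) message and the number of bits needed to send it over the serial link. The 3-byte layout (status byte 0x90 plus channel, then note number, then velocity) follows the MIDI 1.0 convention; the function names are hypothetical.

```python
def note_on(channel: int, note: int, velocity: int) -> bytes:
    """Build a 3-byte Note On message (channel 0-15, note and velocity 0-127)."""
    if not (0 <= channel <= 15 and 0 <= note <= 127 and 0 <= velocity <= 127):
        raise ValueError("value out of range")
    return bytes([0x90 | channel, note, velocity])


def bits_on_wire(message: bytes) -> int:
    """Each 8-bit byte is framed by 1 start bit and 1 stop bit: 10 bits per byte."""
    return len(message) * (1 + 8 + 1)


msg = note_on(channel=0, note=60, velocity=100)  # note 60 is middle C
print(msg.hex(), bits_on_wire(msg))  # 903c64 30
```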

HOW IT WORKS

  • MIDI controller (a piece of hardware or software) creates and transmits MIDI data to other MIDI devices
  • Sequencer will send playback instructions (data) to any connected and compatible output components

ADVANTAGE

  • Can be edited easily and played back to sound like any instrument
    Reason: MIDI is a recording of data events, not the actual audio
  • Very small file size
    Reason: It only represents the playback information, not the actual audio

APPLICATION

  • Store ringtones on phones

Sound & Video

SOUND – KEY CONCEPT

  • Bit Depth: the number of bits available for each sample
    • The higher the bit depth, the higher the quality of audio, because each sample is recorded in more detail
    • Bit Depth is usually 16 bits on CD and 24 bits on DVD
      A Bit Depth of 16 gives 65,536 (2^16) possible values, because each sample can be any binary value between 0000 0000 0000 0000 and 1111 1111 1111 1111
  • Bit Rate: the number of bits used per second of audio
    • Bit rate = sample rate (frequency) * bit depth * channels
      The number of channels is usually two (stereo, because humans have two ears)
    • Storage = bit rate * length of song
  • Sample rate (Hz): the number of audio samples captured each second
    • Standard audio sample rate is 44,100 Hz
    • Telephone networks and VoIP services use an 8 kHz sample rate to save memory, since the human voice can still be heard clearly at that rate
    • The higher the sample rate, the higher the quality
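
The formulas above can be worked through with CD-quality values. This is a minimal sketch: the function names and the 180-second song length are assumptions for illustration, and the result is for uncompressed audio.

```python
def bit_rate(sample_rate_hz: int, bit_depth: int, channels: int) -> int:
    """Bit rate in bits per second = sample rate * bit depth * channels."""
    return sample_rate_hz * bit_depth * channels


def storage_bytes(rate_bps: int, seconds: int) -> int:
    """Storage = bit rate * length of song, converted from bits to bytes."""
    return rate_bps * seconds // 8


levels = 2 ** 16                     # values available at a bit depth of 16
cd_rate = bit_rate(44_100, 16, 2)    # CD audio: 44,100 Hz, 16 bits, stereo
size = storage_bytes(cd_rate, 180)   # hypothetical 3-minute song

print(levels)   # 65536
print(cd_rate)  # 1411200
print(size)     # 31752000
```

So a 3-minute uncompressed stereo track at these settings needs roughly 31.75 million bytes.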

SOUND – MPEG-1 AUDIO LAYER 3 (MP3)

  • Uses Perceptual Music Shaping technology to remove sounds that the human ear can’t hear clearly
  • Able to compress a normal music file by about 90% whilst retaining most of the quality
    • E.g. a 60 MB CD track is turned into a 6 MB MP3 file
  • Music data is often stored on MP3 players which plug into computers via USB port

VIDEO – KEY CONCEPT

  • Bit rate: total audio and image data processed every second
  • Created from a series of static images played at a high speed
    • Around 24 frames per second (fps) up to about 100 frames per second or more
    • HD film is about 50 or 60 fps.
  • TV and computer screens have a specification in Hz to indicate the frame rate they support.

VIDEO – MPEG-4 (MP4)

  • Operates similarly to MP3
  • Allows the storage of multimedia rather than just sound
    E.g. Videos, Photos, Animations
  • Compressed using a lossy technique to
    • Reduce the resolution
    • Reduce the dimensions
    • Reduce the bit rate

Digital Image

KEY CONCEPT

  • Pixel: the smallest controllable element of a picture represented on the screen
  • Pixel Resolution: a measure of pixel density, usually measured in dots per inch (dpi)
    • The higher the resolution, the higher the quality, but also more memory is needed
  • Color/Bit Depth: indicates how many colors are available for each pixel
    • The higher the bit depth, the more colors can be stored and the larger the file size
    • Black and white image uses two colors, hence color depth is 1 bit (0 or 1)
    • 8-bit image can store 256 colors (2^8). Each pixel requires 1 byte of memory
  • Color table: contains information about all available colors
  • Metadata: information about the image, including:
    • Filename
    • File Type
    • Color Depth
    • File dimensions
    • Resolution
    • Time and date the image last changed
    • Camera settings when the photo was taken
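
The relationship between color depth, number of colors and memory can be sketched as follows. This assumes an uncompressed image and ignores the color table and metadata; the function names and the 1024 x 768 dimensions are hypothetical.

```python
def colors_available(bit_depth: int) -> int:
    """Number of distinct colors a pixel can take at a given bit depth."""
    return 2 ** bit_depth


def image_bytes(width: int, height: int, bit_depth: int) -> int:
    """Raw pixel data in bytes: width * height * bit depth / 8, rounded up."""
    return (width * height * bit_depth + 7) // 8


print(colors_available(1))          # 2   (black and white)
print(colors_available(8))          # 256
print(image_bytes(1024, 768, 8))    # 786432 (1 byte per pixel)
print(image_bytes(1024, 768, 1))    # 98304
```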

DIFFERENT TYPES OF IMAGE

1. BITMAP
Organised grid of colored squares called pixels.

KEY TERMS

  • Bitmap-file header: type, size, the layout of bitmap
  • Bitmap-information header: the dimensions, compression type, color format

FEATURE

  • Each color on the image is stored as a binary number
    • In a black-and-white image, each pixel is either black (represented as 1) or white (represented as 0)
  • Information about the pixels is stored with the file (metadata)

ZOOMING

[Figure: difference between bitmap and vector images]

When zooming in or enlarging a bitmap image, pixels are stretched and made into larger blocks. This makes the bitmap images appear as poor quality when enlarged too much.
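
The blocky effect can be reproduced with a tiny black-and-white bitmap. This is a minimal sketch, assuming the simplest enlargement method (repeating each pixel, often called nearest-neighbor); real image viewers may smooth the result instead. The function name is hypothetical.

```python
def enlarge(bitmap, factor):
    """Enlarge a bitmap by repeating each pixel `factor` times in both directions."""
    out = []
    for row in bitmap:
        wide = [pixel for pixel in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


small = [[1, 0],
         [0, 1]]

# Print black pixels (1) as '#' and white pixels (0) as '.'
for row in enlarge(small, 2):
    print("".join("#" if pixel else "." for pixel in row))
```

Each original pixel becomes a 2 x 2 block, so no new detail is added when the image is enlarged.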

EXAMPLE

  • Tagged Image File Format (TIF)
    • Lossless compression
    • Standard format for storing images for printing/publishing industry
    • Developed for storing high-quality images
    • Larger than JPEG files
  • Joint Photographic Experts Group (JPEG)
    • Lossy compression
      A new file is created after compression by reducing the raw bitmap image between 5%-15%
    • Standard format for storing images in digital cameras and displaying images on the web.
    • Smaller than TIF files
  • Portable Network Graphic Format (PNG)

2. VECTOR

FEATURE

  • Stores the image by using scalable shapes (e.g. straight lines/curves), coordinates and geometry to define parts of the image.
  • Instructions to the computer on how to process the image are stored with the file

ZOOMING

[Figure: difference between bitmap and vector images]

Zooming the image won’t destroy the quality because instructions rather than actual image features are stored with the file.

ADVANTAGE

  • More efficient than bitmaps at storing large areas of the same color
  • Can be scaled without losing resolution
  • Can be enlarged/reduced in size, but file size stays almost the same
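
Why the file size barely changes can be sketched with a shape stored as coordinates. This is an illustrative assumption about how a vector shape might be held in memory, not any particular file format; the names are hypothetical.

```python
def scale(points, factor):
    """Scale a shape by multiplying every coordinate by the same factor."""
    return [(x * factor, y * factor) for x, y in points]


triangle = [(0, 0), (4, 0), (2, 3)]   # three corner points
big = scale(triangle, 10)

print(big)                        # [(0, 0), (40, 0), (20, 30)]
print(len(triangle), len(big))    # 3 3
```

The enlarged shape is still described by three points, so the amount of stored data stays the same.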

APPLICATIONS

  • CAD package
  • Animated movies

EXAMPLE: SVG


Text & Number

DIFFERENT FORMATS OF STORAGE

  • American Standard Code for Information Interchange (ASCII)
    • Uses one byte per character: standard ASCII defines 128 (2^7) characters, and 8-bit extended ASCII allows 256 (2^8)
    • Useful for representing standard English
    • Limited character set
  • Unicode
    • In its 16-bit form, uses 2 bytes to give 65,536 (2^16) possible characters
    • Useful for representing a wider range of characters
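
The difference in storage per character can be seen by encoding the same text both ways. This is a minimal sketch, assuming Python's built-in "ascii" and "utf-16-be" codecs stand in for the two formats above; the sample strings are made up.

```python
text = "Data"

ascii_bytes = text.encode("ascii")      # 1 byte per character
utf16_bytes = text.encode("utf-16-be")  # 2 bytes per character for these letters

print(len(ascii_bytes), len(utf16_bytes))   # 4 8
print(ord("A"), format(ord("A"), "08b"))    # 65 01000001

# A character outside the limited ASCII set cannot be encoded as ASCII.
try:
    "é".encode("ascii")
except UnicodeEncodeError:
    print("'é' is not in the ASCII character set")
```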

COMPRESSION — RUN-LENGTH ENCODING (RLE)

  • An example of Lossless compression
  • Convert consecutive identical values into a code consisting of the character and the number marking the length of the run
  • The more similar values, the more can be compressed
  • Sequence of data is stored as a single value and count
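
Run-length encoding as described above can be sketched in a few lines. The (character, count) pair representation and the function names are assumptions for illustration; real formats pack the values and counts into bytes in various ways.

```python
from itertools import groupby


def rle_encode(text):
    """Replace each run of identical characters with a (character, count) pair."""
    return [(ch, len(list(run))) for ch, run in groupby(text)]


def rle_decode(pairs):
    """Rebuild the original text by repeating each character `count` times."""
    return "".join(ch * count for ch, count in pairs)


encoded = rle_encode("AAAABBBCCD")
print(encoded)              # [('A', 4), ('B', 3), ('C', 2), ('D', 1)]
print(rle_decode(encoded))  # AAAABBBCCD
```

Decoding returns exactly the original sequence, which is what makes the method lossless. Data with few repeated values gains little, since every run of length 1 still needs its own pair.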

6.1.3 File size estimation

EQUATION

  • Expansion = total bytes per file * expansion rate
  • File size = total bytes per file + expansion
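
The two equations can be combined into one small calculation. This is a minimal sketch, assuming the expansion rate is given as a fraction (e.g. 0.25 for 25%); the function name and the sample figures are hypothetical.

```python
def estimate_file_size(total_bytes: float, expansion_rate: float) -> float:
    """File size = total bytes per file + (total bytes per file * expansion rate)."""
    expansion = total_bytes * expansion_rate
    return total_bytes + expansion


# Hypothetical file of 20,000 bytes with a 25% allowance for expansion.
print(estimate_file_size(20_000, 0.25))  # 25000.0
```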