Hello, and welcome to this fourth sequence on binary coding. We will talk about data organization. We saw how to represent data as bits, whether numbers, images or sounds. Now let's see how to store it, concretely, and how to organize it. Data is stored in RAM or on the hard drive, or transferred over the network. Whatever it represents, there has to be a clear way to tell how it is encoded and organized: in which order, in which format, and so on. The computer will read the data, generate it, or both. In every case, it must understand how the data is coded. In practice, there will be a header that describes the size of the data, possibly the length, possibly the colors, the compression format, and so on. The header specifies the coding of the data. The computer first looks at the header that comes with the data to see how it is stored, then reads the data according to what the header describes. The computer manipulates a certain number of basic data types, starting with whole numbers, large and small. Whole numbers can be stored on 8 bits, that is, on 1 byte, which gives 256 values (2 to the power 8), or on more bits for larger whole numbers. Today's computers can also manipulate whole numbers on 64 bits, which means about 18 billion billion values (2 to the power 64). So there are different sizes of whole numbers that the computer can manipulate efficiently. These whole numbers can also be negative: depending on the need, we can use either positive-only whole numbers or whole numbers that may be positive or negative. Whole numbers are good because they are exact and there are many values, but that is not always enough, because sometimes we need to manipulate decimals. In computing, such a number is called a floating-point number: a number followed by a power of 10, an exponent. These numbers can be very large, much larger than whole numbers; on the other hand, the number of significant digits is limited to about 10 or 20, depending on the kind of floating-point number used. With these two big categories of numbers, many things can be done, and the computer can manipulate them efficiently.
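To make the figures above concrete, here is a minimal Python sketch. The helper names (`unsigned_range`, `signed_range`, `to_float32`) are made up for this illustration, and the negative range assumes the usual two's-complement convention, which the lecture does not name.

```python
import struct

def unsigned_range(bits: int) -> tuple[int, int]:
    """Smallest and largest whole number on `bits` bits, positive only."""
    return 0, 2 ** bits - 1

def signed_range(bits: int) -> tuple[int, int]:
    """Same, when half of the values are used for negative numbers
    (assumption: two's-complement encoding)."""
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1

def to_float32(x: float) -> float:
    """Round x to the nearest 32-bit floating-point number and back."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

# The number of distinct values doubles with every extra bit.
for bits in (8, 16, 32, 64):
    print(bits, "bits:", 2 ** bits, "values",
          unsigned_range(bits), signed_range(bits))

# A floating-point number keeps only a limited number of significant digits:
# 0.1 stored on 32 bits is close to 0.1 but not exactly equal to it.
print(to_float32(0.1))

# In exchange, its range is huge compared with 64-bit whole numbers.
print(1e300 > 2 ** 64)
```

The trade-off shows up directly: whole numbers are exact but bounded by the number of bits, while floating-point numbers reach much further but round what they store.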
The programmer looks at what data has to be stored and chooses the type of data accordingly. So, to store an age, for example, which is between 0 and 100, the programmer will use a positive whole number stored on a rather small number of bits. On the other hand, to store temperatures, a floating-point number without much precision will do, as temperatures don't have to cover a wide range of values. For a level of gray, as seen previously, 8 bits are used, so rather small whole numbers. After that, data more complex than single numbers will be manipulated. Sometimes a date has to be stored. A date is made of 3 numbers: the day, the month and the year. A color has 3 components: red, green and blue, meaning one number for each component. So there are combinations of numbers to store. They are called tuples, or structures or records in computing. Let's simply say that a tuple is made up of 3 whole numbers, such as the day, the month and the year, or the quantity of red, green and blue. The 3 numbers are assembled to create a more complex piece of data made of these 3 values. We can also make tuples of tuples if we say, for example, that a period is made of 2 dates, each made of 3 numbers, or that an image is made of pixels, each made of 3 color components. So we can gather these numbers to create more complex data. Tuples are practical because they let us assemble numbers, but sometimes the number of things we want to assemble is not known in advance. For example, to manage the guests at a party or the stages of a trip, we don't necessarily know how many there will be. So other data types are used, such as arrays. We put in the array a set of tuples that all have the same structure. That is to say, here, each line has a name, a date of birth and a height. The structure cannot differ from one line to the next. There will be elements of identical structure, laid out one under the other and numbered from 1 to the number of elements in the array.
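The tuples, tuples of tuples and arrays described above can be sketched in Python as follows. The type names, field names and guest data are invented for the example. Note that Python numbers array elements from 0 rather than from 1.

```python
from typing import NamedTuple

class Date(NamedTuple):
    """A tuple of 3 whole numbers."""
    day: int
    month: int
    year: int

class Period(NamedTuple):
    """A tuple of tuples: 2 dates, each made of 3 numbers."""
    start: Date
    end: Date

class Guest(NamedTuple):
    """One line of the array: every line has the same structure."""
    name: str
    birth: Date
    height_cm: float

# A period made of two dates (hypothetical values).
trip = Period(Date(1, 7, 2024), Date(15, 7, 2024))

# An array of guests: the number of lines is not fixed in advance.
guests = [
    Guest("Alice", Date(3, 5, 1990), 170.0),
    Guest("Bob", Date(21, 11, 1985), 182.5),
]
guests.append(Guest("Chloe", Date(9, 2, 2001), 165.0))

# Elements are reached by their number (starting at 0 in Python).
print(guests[0].name, guests[0].birth.year)
print(len(guests), "guests; trip ends in month", trip.end.month)
```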
Numbering the elements this way makes them easy to manage. Another way of storing this kind of information is in a list. A list is something that points to its first element. As before, each element of the list is a tuple. For example, here, the tuple contains a town, a start date and an end date. Each tuple also points to the next one. So there is a list with a pointer to its first element, and each element points to the next one in the list. As a structure, it looks like the array, since it is a series of elements of the same type. But in practice, it changes things from an algorithmic, efficiency point of view. Sometimes the array is more practical, sometimes the list is. For example, inserting an element at the start of an array is more complicated than in a list: in the array, the lines have to be moved down, which means moving all the data that had already been stored. Elements don't have to be moved when one is inserted in a list. A new element is simply created in the computer's memory and we say, "now the start of the list is this element, and it points to the previous start of the list." Some operations are more efficient in an array, and some are more efficient in a list. It depends on what we want to do with our data structures. All these data structures will be stored in memory, concretely. So how will they be laid out in memory? In which order, with what size, and so on? What matters most now is to define clearly how and where to store the information. There will be a place where we specify how the data is organized in memory. The programmer is the one who does this when writing the program. At the start, the programmer will choose, for example, to store each line of the array beginning with the person's name, then their age, then their height. Or in a different order, as long as it is clearly defined at the beginning of the program so that the whole program recognizes the information when it encounters it.
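To make the earlier comparison between array and list concrete, here is a minimal sketch of a linked list of trip stages in Python. The names `Stage`, `push_front` and `towns`, as well as the towns and dates, are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Stage:
    """One element of the list: a tuple plus a pointer to the next element."""
    town: str
    start: str
    end: str
    next: Optional["Stage"] = None

def push_front(head: Optional[Stage], town: str, start: str, end: str) -> Stage:
    """Insert at the start: create one element that points to the old start.
    Nothing already stored has to move."""
    return Stage(town, start, end, next=head)

def towns(head: Optional[Stage]) -> list[str]:
    """Follow the pointers from the start of the list to its end."""
    names = []
    while head is not None:
        names.append(head.town)
        head = head.next
    return names

# Build the list by inserting at the start each time.
head = None
head = push_front(head, "Lyon", "2024-07-03", "2024-07-05")
head = push_front(head, "Paris", "2024-07-01", "2024-07-03")
print(towns(head))

# Same insertion in an array: every existing element is shifted by one place.
array = ["Lyon"]
array.insert(0, "Paris")
print(array)
```

Both versions end up holding the same sequence; the difference is in the work done. The list creates one element and updates one pointer, while the array has to shift everything it already contains. In return, the array reaches element number n directly, whereas the list has to follow n pointers.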
The specification of how data is organized cannot change in the middle of the program. The same problem arises when saving things in a file: one program has to generate the file while respecting the specs so that the others can understand it. When the specs are public, this is what we call an open file format. If we take a GIF image, whose format is open, we know in advance that, to decode it, there will necessarily be a set of characters at the beginning of the file, "GIF89a", then 16 bits to store the width, then 16 bits to store the height of the image. So anyone who wants to read a GIF file to display the image properly knows the information is organized that way. Anyone who wants to save an image in this format knows it can be done this way so that others can read it. We simply define specs that everyone has to respect. When the file format isn't open, some programs can generate it but others may not be able to read it. So we can keep in mind that data is made up of basic elements: whole numbers, floating-point numbers, and maybe letters or other things. These elements are combined to represent more complex data with tuples, which assemble elements, and with arrays or lists, which can contain many elements and grow or shrink depending on what the programmer and the user need. What matters most is to define clearly how things are assembled so that everyone can use them. Specifications must be defined to say, "data is organized in memory in such a way, in such an order and with such a size," so that the one who writes and the one who reads can understand each other.
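As an illustration of writer and reader agreeing on a format, the start of a GIF file described above can be handled with a short Python sketch. The function names are invented for the example. The byte order (least significant byte first) comes from the GIF specification rather than from the lecture, and the reader also accepts the older "GIF87a" signature, which the lecture does not mention.

```python
import struct

# 6 characters, then two unsigned 16-bit numbers, low byte first.
HEADER_FORMAT = "<6sHH"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 10 bytes

def write_gif_header(width: int, height: int) -> bytes:
    """Produce only the first 10 bytes of a GIF file, not a complete image."""
    return struct.pack(HEADER_FORMAT, b"GIF89a", width, height)

def read_gif_header(data: bytes) -> tuple[str, int, int]:
    """Decode the signature, width and height from the start of a GIF file."""
    if len(data) < HEADER_SIZE:
        raise ValueError("too short to be a GIF file")
    signature, width, height = struct.unpack(HEADER_FORMAT, data[:HEADER_SIZE])
    if signature not in (b"GIF87a", b"GIF89a"):
        raise ValueError("not a GIF file")
    return signature.decode("ascii"), width, height

# The writer and the reader follow the same specs, so they understand each other.
header = write_gif_header(640, 480)
print(header)
print(read_gif_header(header))
```

To try it on a real image, pass the first bytes of the file instead, for example `read_gif_header(open(path, "rb").read(10))`, where `path` is whatever GIF file is at hand.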