
    Hardlimit Museum

    • vreyes1981

      Olé!! Little by little, yes sir.

      And I like the new design, with the magazine covers really big. What a blast of nostalgia.

      • cobito (Administrator)

        The first feature of the year has arrived, and it's for the museum. We're launching a new disc browser, which is the seed of the future file browser that will extend it. It uses the new front-end, so the views are optimized for both large screens and mobile devices.

        As designed, it should let you browse all kinds of media comfortably and intuitively: besides magazine discs, also operating systems, drivers, demos, compilers, etc. The content will arrive over time. The entire table structure is already created, and in the next iteration the concept will be put into operation with two or three discs.

        Due to space limitations, only the PCManía CD-ROMs are available for the moment. I will look for temporary solutions; I have a 2TB drive lying around that I might be able to use as a stopgap.

        The disc card (medio.php), of which you have an example here, shows the cover along with information about the medium. The following information is shown when available:
        • Name of the publication.
        • Size of the medium.
        • Format of the medium.
        • System with which it was originally created.
        • Volume.
        • Publisher.
        • Entity that prepared the data.
        • Program originally used to create it.
        • MD5 sum.
        • SHA-256 sum.
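Several of the fields listed above (system, volume, publisher, data preparer, application) have direct counterparts in the ISO 9660 primary volume descriptor. The thread does not say how the museum obtains them, so the following is only a minimal sketch, assuming a plain .iso image with 2048-byte sectors; the function names are made up for illustration.

```python
import struct

SECTOR_SIZE = 2048   # assumption: plain .iso image, 2048-byte logical sectors
PVD_SECTOR = 16      # ISO 9660 volume descriptors start at sector 16


def parse_primary_volume_descriptor(sector: bytes) -> dict:
    """Decode the main text fields of an ISO 9660 primary volume descriptor."""
    if len(sector) < 702:
        raise ValueError("sector too short for a volume descriptor")
    if sector[0] != 1 or sector[1:6] != b"CD001":
        raise ValueError("not an ISO 9660 primary volume descriptor")

    def text(start: int, length: int) -> str:
        # Identifier fields are fixed-width, padded with spaces (sometimes NULs).
        raw = sector[start:start + length]
        return raw.decode("ascii", "replace").strip(" \x00")

    # Numeric fields are stored in both byte orders; read the little-endian copy.
    (blocks,) = struct.unpack_from("<I", sector, 80)
    (block_size,) = struct.unpack_from("<H", sector, 128)
    return {
        "system": text(8, 32),
        "volume": text(40, 32),
        "publisher": text(318, 128),
        "data_preparer": text(446, 128),
        "application": text(574, 128),
        "size_bytes": blocks * block_size,
    }


def read_pvd(path: str) -> dict:
    """Read sector 16 of an image file and parse it."""
    with open(path, "rb") as image:
        image.seek(PVD_SECTOR * SECTOR_SIZE)
        return parse_primary_volume_descriptor(image.read(SECTOR_SIZE))
```

Raw .bin images with 2352-byte sectors would need the sector header skipped first, which this sketch does not attempt.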
        Download links are also shown for the file (.iso, .zip) or files (.cue/.bin) that make up the medium. Downloads are enabled, in principle, without limits. What comes next will depend mainly on how much storage space I manage to rescue from the drives I have lying around.
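Since each disc card publishes an MD5 and a SHA-256 sum, a download can be checked against them. A minimal sketch using Python's standard `hashlib`; the function names are illustrative and not part of the museum's code.

```python
import hashlib


def file_checksums(path: str, chunk_size: int = 1 << 20) -> dict:
    """Return the MD5 and SHA-256 hex digests of a file, read in chunks."""
    md5 = hashlib.md5()
    sha256 = hashlib.sha256()
    with open(path, "rb") as handle:
        # Reading in chunks keeps memory use flat even for full CD images.
        while chunk := handle.read(chunk_size):
            md5.update(chunk)
            sha256.update(chunk)
    return {"md5": md5.hexdigest(), "sha256": sha256.hexdigest()}


def matches_published(path: str, md5_hex: str, sha256_hex: str) -> bool:
    """Compare a local file against the sums shown on the disc card."""
    sums = file_checksums(path)
    return (sums["md5"] == md5_hex.strip().lower()
            and sums["sha256"] == sha256_hex.strip().lower())
```

Both digests are computed in a single pass, so the image is read from disk only once.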

        Toda la actualidad en la portada de Hardlimit
        Mis cacharros

        hlbm signature

        • _Neptunno_ (Moderator), replying to @cobito

          @cobito I love the hard work you're putting into the Museum and the new interface for the disk browser. As a professional in the digital preservation sector, I can only take my hat off to you.

          Let me tell you something you'll like: my company is dedicated precisely to digital preservation on a global scale. We work with clients ranging from the National Library of Spain (BNE) to international ones like Harvard University, the Holocaust Museum in the US, or HILA (Stanford University), among many other top museums and universities, setting up systems that cost hundreds of thousands of euros. And I'll tell you this: what you're achieving with limited resources is impressively valuable; the structure with the MD5 and SHA-256 sums and the metadata cataloging shows a level of rigor that many institutions would envy.

          That's why I'm so excited about it. In my humble opinion, your work is professional and necessary. (Keep in mind that I'm just a technician, but I have been through many projects, especially the one at the BNE, where millions of pages were published for people to access from home, thanks in part to my work over the years.) I know how difficult it is to digitize these archives and give them visibility, and I take my hat off to yours.

          It's vital to give visibility and access to this content before it's lost forever. In fact, if you want, I'll talk to my company so that you can negotiate a safebox with them and we'll take the Museum to the big leagues, hahaha

          Seriously, you're doing great work and the community will always be grateful. As soon as I get my hands on the 486 or the Pentium 166 I have around here, I'm sure I'll be using this material to the fullest!

          Hugs!!

          • cobito (Administrator), replying to @_Neptunno_

            @_Neptunno_ You're going to make me blush.

            You have no idea how happy and motivated your words make me. On the technical side, I don't have as many doubts: it's the typical thing that can be done well in many ways, and I'm probably not doing it perfectly, but not badly either. But when it comes to structure and organization, I follow my intuition more than "technical" standards. My references are Archive.org and WinWorldPC. And, well, the experience of this being my third attempt (let's see if the third time is the charm).

            @_Neptunno_ said in Hardlimit Museum:

            We work with clients ranging from the National Library of Spain (BNE) to international ones like Harvard University, the Holocaust Museum in the US, or HILA (Stanford University), among many other top museums and universities, setting up systems that cost hundreds of thousands of euros.

            Having someone with this track record tell you that you're not on the wrong track is very motivating. You are probably the biggest expert on Hardlimit (and certainly the biggest expert in the field that I know), so please share any criticism or suggestion you have, whatever it is.

            This is something I want to do well. Since the monetary cost of development is zero (the cost in time is not) and the hardware can be scavenged, it would be great for the formal side to be done rigorously.

            It's still very immature, both in functionality and content. So as it evolves, you will probably see things that can be improved (if you haven't already).

            Thank you!

            • _Neptunno_ (Moderator), replying to @cobito

              @cobito I'll do everything I can to help you; as you know, it's a pleasure, though don't go thinking I'm at the level of a preservation engineer!

              I think you would enjoy learning from my development colleagues. There is an incredible amount of work behind programming software that manages not just terabytes but petabytes of information (not to mention a thousand little-known things, such as the "Transfer Connector": a critical function that acts as a bridge for ingesting data in a secure and structured way). I have less to say about that part, since my role is more system support, but I have also been on many digitization projects of all sizes, and I can't help but look at your museum with professional eyes. I find it admirable and, above all, very useful for the community!!

              To give you an idea of the scale, the systems we set up are responsible for preserving "digital knowledge" for the next 200 years (or so I heard in a meeting a while ago, haha). We always say that if this had existed in the time of the Library of Alexandria, nothing would have been lost. The data is stored in redundant systems that constantly audit the files to make sure they are healthy and that the disks keep their integrity. If something fails, there are several more copies in the pools ready to step in while the damaged disks are replaced, sometimes in a mad rush.

              Measures are taken even in the face of catastrophes or wars. With the conflict in Ukraine, for example, critical copies were already being moved to safe locations (for example, from the UK to Ireland) to prevent the information from being lost in case of direct confrontation.

              That being said, I must point out that I am only the "last guy" in my company! But these things are cool, and I'm telling you about them as a bit of gossip from that world.

              Best regards!!

              • cobito (Administrator), replying to @_Neptunno_

                @_Neptunno_ What a find. A while ago I read about a new Rosetta stone (new in the sense that it was created recently) in which information was stored in a spiral, with the characters shrinking from visible to the naked eye down to microscopic, with a goal similar to that of the original Rosetta Stone (to serve as a translation table). The plan was then to reproduce it and distribute it all over the world, as a way of achieving that redundancy.

                After some searching, I found it. It's called The Rosetta Project, and what they have done is create a disk that stores 13,000 pages of information in 1,500 languages in such a way that the text can be read with a microscope (no digital information).

                You probably know about it because, judging by the page, it's run by Stanford University, which your company has worked with. You may have even participated in some related development!

                Anyway, what I meant to say was that this was the only example I knew of an attempt to preserve knowledge on that scale. I would never have imagined that there were so many resources put into preserving knowledge at the level you describe.

                Very interesting, indeed.

                • _Neptunno_ (Moderator), replying to @cobito

                  @cobito the most important part of my company is preservation, which is where the most resources are invested and is the "soul" of the company. Then there is the digitization department, which was the one that started everything.

                  At first we were fully involved in mass digitization, something quite "mechanical" (scanning material and generating metadata and bibliographic records for the images). We worked with specialized scanners to obtain TIF files (at 300dpi, although over time this went up to 400 and 600 with very specific cameras) and then generated the derivatives (JPG, PDF). Depending on the project, some were simple and others required mass renaming and complex structuring, such as periodicals (newspapers and magazines). To give you an idea, we have digitized everything from the AS newspaper to early 20th-century press for the BNE, along with projects for the Prado Museum or the University of Granada (some project of the Royal Hospital of Granada), among others.
                  Smaller projects were also done, for village town halls and urban planning files, to give another example. All this is just to show you that it has been many years of doing a bit of everything.

                  My job for a long time was precisely that image processing, although I also did system support. In parallel, the company grew exponentially developing digital preservation software. That software is the one that manages everything now. We no longer just digitize documents, but we guarantee that they remain intact and accessible.

                  If you go to the BNE Digital Library, you can see millions of pages to which I have contributed in a small way with my work. As a curious anecdote: I worked with collections of photos from the Civil War that were not open to the public due to copyright, but that the BNE had to preserve in order to release them in the future. Seeing those images was quite shocking.

                  I'm sorry to disappoint you, but the Rosetta Project at Stanford is something they develop on their own. That said, as a client we work with their infrastructure and, well... you would laugh if I told you that we have had to complain about how stingy they are. They gave us virtual machines with 127GB system disks (the Hyper-V default size) and we had to protest so they would give them "more juice", because they kept filling up with logs.

                  Best regards!!

                  • cobito (Administrator)

                    I have finally been able to organize all the museum content. I rescued a 2TB drive from the storage room that will give me plenty of room to continue with this. In addition, I have been able to make space on the backup drive, so the part that bothered me the most about using a drive this way is resolved.

                    For now, all the magazine discs are already there. @vreyes1981, your Micromanía discs are there, specifically from 2002 (full year) and 2003 (until April), which is what you had uploaded. If you have more material around, don't hesitate to upload it (let me know if you do so I can add it).

                    The database structure has also been finalized for the next step: the file explorer. The initial phase will let you navigate the directory tree of all available discs, and it should be operational in the next round of Museum work. I will see whether I dedicate time to the test bench or the museum next.

                    • cobito (Administrator)

                      This week we launched the first phase of the file browser. All files from almost all disks are already indexed and browsable. We left out a couple of MDF/MDS images from Micromanía that we can't extract, even though most MDF/MDS images extract fine. Here is a disk whose card, in addition to the information already announced, now includes a browsable list of files and folders.

                      Besides the file explorer, it is possible to view detailed information about each file along with its duplicates, that is, files that may have different names, dates, and other attributes but whose content is the same. The information shown is as follows:
                      • File name and path in the current medium.
                      • Original creation date in the medium.
                      • Size in ISO/IEC 80000-13 binary format.
                      • The type of content, described in words (for now, most files show no information here; it will be added over time).
                      • MIME type (it is not 100% precise, but almost).
                      • A more detailed description of the file's content; for example, for a self-extracting executable, details of the encapsulated content are given.
                      • An MD5 signature.
                      • A SHA-256 signature.
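Two items in this list lend themselves to a short illustration: sizes with ISO/IEC 80000-13 binary prefixes, and duplicates defined by content rather than by name or date. The following is only a sketch of those two ideas, not the museum's code; the function names and the in-memory `files` mapping are assumptions.

```python
import hashlib
from collections import defaultdict


def format_binary_size(size: int) -> str:
    """Format a byte count with binary prefixes (KiB, MiB, GiB, TiB)."""
    units = ["B", "KiB", "MiB", "GiB", "TiB"]
    value = float(size)
    for unit in units:
        if value < 1024 or unit == units[-1]:
            return f"{size} B" if unit == "B" else f"{value:.1f} {unit}"
        value /= 1024


def find_duplicates(files: dict) -> list:
    """Group paths whose content has the same SHA-256 digest.

    `files` maps a path to its raw bytes; names and dates play no role.
    """
    by_digest = defaultdict(list)
    for path, content in files.items():
        by_digest[hashlib.sha256(content).hexdigest()].append(path)
    return [sorted(paths) for paths in by_digest.values() if len(paths) > 1]
```

In a real index the digests would be stored once per file and grouped with a database query instead of being recomputed in memory.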
                      Here is an example with pkunzip.exe, which is a pretty popular file.

                      In this first phase, determining the character set of each file system has been a challenge. Sometimes UTF-8 is used, other times CP850, and there are a couple of images that appear to have been written incorrectly from the start because of some fault in the creation software (apparently not uncommon in the 90s). In any case, file names are displayed correctly, with their ñ's, accents, and so on, regardless of the original encoding.

                      We have 570,000 files. They come from the media of the five publications we have at the moment, which serve as the reference for all development before much more is added.

                      The second phase of the browser has also begun. It consists of extracting every extractable file and then repeating the operation recursively on what was extracted. At the moment we support more than 70 compressed formats from all eras, detected by heuristics rather than by extension or magic number, which keeps any supported file from slipping through. Once this first version of the extractor is polished, it will go into production. More formats will be added over time, but for now we are staying with those 70-80 to prioritize other things.

                      On another note, a change to the v86 configuration makes the virtual machines respond much faster (example). I hadn't noticed the problem when running locally, but now that all traffic goes through a VPS, I am spotting latency-dependent performance details that can be improved.

                      Finally, the algorithm behind the new page translation system has been corrected, so pages now render much faster (this affects both the museum and the test bench). The change has also stopped the translation system from breaking certain functions, such as the magnifying glass in the magazines and mouse pointer capture in the virtual machines.
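The character-set problem described in this post (UTF-8 on some images, CP850 on others) can be approached by trying a strict UTF-8 decode first and falling back to a legacy code page. This is a minimal sketch under that assumption; the post does not describe the actual detection logic, and `decode_name` is a made-up name.

```python
def decode_name(raw: bytes, fallbacks=("cp850",)) -> str:
    """Decode a raw file name: strict UTF-8 first, then legacy code pages."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        pass
    for encoding in fallbacks:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue
    # Last resort: keep every byte visible instead of failing.
    return raw.decode("utf-8", errors="backslashreplace")
```

Strict UTF-8 goes first because accented CP850 bytes rarely form valid UTF-8 sequences, so a successful UTF-8 decode is a reasonably strong signal. It is still a heuristic: a CP850 name that happens to be valid UTF-8 would be misread.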

                      • cobito (Administrator)

                        It is now possible to view the contents of extractable files. These can be any type of file, and the contents can be other files or sections of binaries. The sections often contain unimportant binary data, but others embed relevant content such as images, sounds, animations, cursors, plain text and so on (mainly in DLLs). Here is an example of a .zip that in turn contains another zip. The description shows the format and the algorithm used in each of them.

                        70% has been indexed, and I expect the process to finish over the weekend. So far, 1.3 million files/sections have been added to the ones we already had. With this, phase 2 is in principle complete (pending the rest of the indexing, which is automatic) and the driest part of the work is closed until new formats are added (there is already a list for the next iteration).

                        Phase 3 will possibly begin soon. It consists of being able to view files from the browser, that is: images, videos, sounds, MIDIs, MODs, documents, etc. It is one of the coolest parts of the explorer; everything done so far was a prerequisite for it, and it can turn the explorer into a powerful tool for digital archaeology.

                        On the other hand, the front-end has been consolidated throughout the museum (except for hardware cards): the new one is now used in all sections, which improves the look on the desktop and fixes many things that were broken in the mobile version. In addition, each hardware and software entry now includes the videos I uploaded to Peertube at the time: example.
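The zip-inside-a-zip case can be sketched with Python's standard `zipfile` module. This covers a single format only, whereas the museum's extractor handles dozens, and `walk_archive` is a hypothetical name. Nested archives are recognized by content via `zipfile.is_zipfile`, not by file extension.

```python
import io
import zipfile


def walk_archive(data: bytes, prefix: str = "", max_depth: int = 5):
    """Yield (path, size) for every member, descending into nested zips."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            path = f"{prefix}{info.filename}"
            content = archive.read(info)
            yield path, len(content)
            # Decide by content: is_zipfile looks for the zip
            # end-of-central-directory record inside the data itself.
            if max_depth > 0 and zipfile.is_zipfile(io.BytesIO(content)):
                try:
                    yield from walk_archive(content, f"{path}/", max_depth - 1)
                except zipfile.BadZipFile:
                    # Looked like a zip but could not be opened; skip it.
                    continue
```

The `max_depth` limit is a guard against pathological or self-referencing archives.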

                        • cobito (Administrator)

                          Phase 3A is in progress and has already entered production (there is still a lot to process, but that is automatic now). Standardized, free formats are being used for viewing. The idea is that, regardless of the original format, everything can be viewed in any browser, because one of the problems with old files is that they often become unplayable due to codec/format/algorithm issues. The chosen formats are VP9 for video, Opus for audio and WebP for images.
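The post names the target formats but not the tool that produces them. As an illustration only, and assuming ffmpeg built with the `libvpx-vp9`, `libopus` and `libwebp` encoders, the conversions could be expressed as argument lists like these. The helper name, the output extensions, and the choice of Opus for a video's audio track are assumptions.

```python
def transcode_command(source: str, kind: str, target_stem: str) -> list:
    """Build an ffmpeg argument list for the given media kind."""
    if kind == "video":
        # VP9 video in a WebM container; Opus for the audio track is assumed.
        return ["ffmpeg", "-i", source,
                "-c:v", "libvpx-vp9", "-c:a", "libopus",
                f"{target_stem}.webm"]
    if kind == "audio":
        # Drop any video stream and encode the audio as Opus.
        return ["ffmpeg", "-i", source, "-vn", "-c:a", "libopus",
                f"{target_stem}.opus"]
    if kind == "image":
        # Encode a single frame as WebP.
        return ["ffmpeg", "-i", source, "-frames:v", "1",
                "-c:v", "libwebp", f"{target_stem}.webp"]
    raise ValueError(f"unsupported media kind: {kind}")
```

Each list can be passed to `subprocess.run(cmd, check=True)`. Building the list separately keeps file names out of any shell quoting.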

                          The following types of files can be viewed from the browser:

                          Images, videos and audio
                          These three types of "media" are being extracted heuristically. That means a large number of images, videos and audio clips are turning up even in files that are not identifiable as such. For images, a lot of metadata is also shown. Over time, metadata will be added to videos and audio, and color histograms may be shown for images (this information is already being captured; it just remains to display it).

                          OCR is also being run on the images. If the text is sharp enough, the results are not bad. In the future it will be used to search for text in images; for now, the OCR output is shown when you open an image.

                          Example 1 of image
                          Example 2 of image (representative example of OCR)

                          MIDIs
                          The MIDIs come in six flavors (no more, no less). They have been rendered with:

                          • OPL2 (synthesizer).
                          • OPL3 (synthesizer).
                          • Gravis UltraSound (official patches).
                          • Roland MT-32 (official ROMs).
                          • FluidR3 (modern soundfont).
                          • ToH (modern soundfont).

                          In some cases, it has not been possible to render the MT-32, GUS and/or ToH versions. Many old MIDIs are malformed, do not follow the standard, etc.
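One way to see what "malformed" can mean is to check a file's header chunk against the Standard MIDI File layout: an `MThd` chunk of length 6 holding the format, the track count and the time division, followed by `MTrk` chunks. The checker below is an illustrative sketch, not the museum's validation, and only inspects chunk structure, not the events inside each track.

```python
import struct


def check_midi_header(data: bytes) -> dict:
    """Parse the MThd chunk of a Standard MIDI File and report problems."""
    if len(data) < 14 or data[:4] != b"MThd":
        return {"valid": False, "problems": ["missing MThd header chunk"]}
    problems = []
    # Big-endian: chunk length, format, number of tracks, time division.
    length, fmt, tracks, division = struct.unpack(">IHHH", data[4:14])
    if length != 6:
        problems.append(f"unexpected header length {length}")
    if fmt not in (0, 1, 2):
        problems.append(f"unknown format {fmt}")
    if fmt == 0 and tracks != 1:
        problems.append("format 0 file must have exactly one track")
    # Count the MTrk chunks actually present and compare with the header.
    offset, found = 8 + length, 0
    while offset + 8 <= len(data):
        chunk_id, chunk_len = struct.unpack(">4sI", data[offset:offset + 8])
        if chunk_id == b"MTrk":
            found += 1
        offset += 8 + chunk_len
    if offset > len(data):
        problems.append("last chunk is truncated")
    if found != tracks:
        problems.append(f"header declares {tracks} tracks, found {found}")
    return {"valid": not problems, "format": fmt, "tracks": tracks,
            "division": division, "problems": problems}
```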

                          Example of MIDI
                          Example 2 of MIDI

                          MODs
                          They are being rendered to match the Amiga's Paula chip as closely as possible, thanks to OpenMPT. Initially, several versions were going to be offered, but here the landscape is bleaker and it seems that all efforts are focused on this implementation. Here, too, files are being scanned heuristically, and very interesting things are turning up, such as PSM files, a kind of Epic MOD format used in things like their Pinball or Jazz Jackrabbit (I had to look up what this is because I had no idea it existed).

                          Example of a PSM from Jazz Jackrabbit
                          Another PSM, from Silver Pinball (precursor of Epic MegaGames' Pinball)
                          Normal MOD
                          Another MOD

                          File browser
                          On another note, when you open any folder, a selection of these files is shown, drawn from the current folder and all the subfolders beneath it (up to 6 files per type). As you navigate deeper, the "media" shown narrows down. If you click on a viewable file, all its information is shown next to the viewer. Results are ordered by "importance", which means the number of pixels for images and the duration for everything else.
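The selection rule described above (up to 6 files per type, ranked by pixel count for images and by duration otherwise, restricted to the current folder's subtree) can be sketched as follows. The item dictionaries and their keys (`path`, `type`, `width`, `height`, `duration`) are assumptions made for the example, not the museum's schema.

```python
from collections import defaultdict

MAX_PER_TYPE = 6  # the post mentions up to 6 files per type


def importance(item: dict) -> float:
    """Images rank by pixel count; everything else by duration in seconds."""
    if item["type"] == "image":
        return item["width"] * item["height"]
    return item["duration"]


def select_media(items: list, folder: str) -> dict:
    """Pick the most important items per type among files under `folder`."""
    prefix = folder.rstrip("/") + "/"
    by_type = defaultdict(list)
    for item in items:
        if item["path"].startswith(prefix):
            by_type[item["type"]].append(item)
    return {
        kind: sorted(group, key=importance, reverse=True)[:MAX_PER_TYPE]
        for kind, group in by_type.items()
    }
```

Calling `select_media(items, "/")` covers the whole disc, and passing a deeper folder narrows the selection, which matches the behavior described in the post.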

                          Processing is slow (we are at 2%). We are still going through the first media on the list. Here is an example:
                          Root directory of PCMania 21

                          In addition, it is possible to see all the files of a specific type from the current directory. For example, here are all the images of PCMania 27.

                          Finally, icons have started to appear next to files and folders to make them easier to identify. Many are still missing, but they will be added gradually. Unicode characters are being used for this, since I am getting into the normalization of formats and encodings.

                          One more thing: this is supposed to be a file search engine. But I went to look for a few files to post in this thread as examples, and it turns out that I forgot to implement the search engine and didn't realize it until now. So that's what I'll try to have ready next time.

                          Phase 3B consists of doing the same thing with documents: txt, rtf, wp5.1, pdf, doc, etc. But this will be left for much later.
