Introduction to automatic collation


This is the home page of the online workshop Introduction to automatic collation, which will take place on the 24th and 25th of September. The course is offered by the Programme Doctoral en Études Numériques of the University of Lausanne (UNIL).

From this home page you can access:

Please send any questions, problems or suggestions to the course instructors: Elena Spadini (UNIL) - Elisa Nury (UNIGE/UNIL) - Helena Bermúdez Sabel (UNIL)

Logistics

The course is held online on the Zoom platform hosted at UNIL. Participants will receive the room ID and password by email shortly before the beginning of the course. There is no need to have a Zoom account to join the room. You can access the room via the browser or the Zoom desktop app.

Program outline

Day 1

Day 2

Basic info about ...

... GitHub

GitHub is a platform to host, share and review code. It is used by developers and researchers to collaborate on the development of software for any kind of project, open source or not. A GitHub repository, or repo, is the place where all the files of a project are hosted; a GitHub repo is "physically" a directory (a folder) that can be cloned or downloaded to your computer. GitHub is built on Git, a distributed version control system: it records changes over time without overwriting previous versions of a file, which is very useful for collaboration.
Examples of GitHub repos? The Catalogue of Digital Editions' data repo; the repo that contains all the materials of this workshop; the repo of this website.
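To make the idea of "saving changes without overwriting previous versions" more concrete, here is a minimal toy sketch in Python. It is not how Git works internally and it does not use Git at all: the class TinyHistory and its methods are hypothetical names invented for this illustration. It only shows the principle that every saved version stays retrievable.

```python
import hashlib


class TinyHistory:
    """Toy version history: every save is kept, nothing is overwritten.

    Hypothetical helper for illustration only; real Git is far richer.
    """

    def __init__(self):
        # List of (version_id, message, content), oldest first.
        self.versions = []

    def commit(self, content, message):
        # Derive a short identifier from the position in the history and the content.
        raw = f"{len(self.versions)}:{content}".encode("utf-8")
        version_id = hashlib.sha1(raw).hexdigest()[:7]
        self.versions.append((version_id, message, content))
        return version_id

    def checkout(self, version_id):
        # Return the content exactly as it was saved under this identifier.
        for vid, _message, content in self.versions:
            if vid == version_id:
                return content
        raise KeyError(version_id)


history = TinyHistory()
first = history.commit("Witness A: the quick brown fox", "first draft")
second = history.commit("Witness A: the quick brown fox jumps", "add verb")

# The first version is still available after the second one was saved.
print(history.checkout(first))
print(history.checkout(second))
```

In a real project, Git does this bookkeeping for a whole directory of files and for several collaborators at once.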

How will we use GitHub during the workshop? As a place to host the workshop materials. We will download them together at the beginning of the workshop, so you don't have to do anything beforehand. They will remain available on GitHub after the workshop as well.

... Binder

Binder makes it possible to create a computing environment that can be shared and used by many remote users: each user accesses a virtual computer (or virtual machine) ready to be used, where all the required programs have been installed and the data are available. In our case, we use mybinder.org, a service that works with GitHub repositories: we just give it the URL of our GitHub repo and Binder creates a virtual computer for each participant, where they can access and execute all the files provided in the workshop materials. We should be aware that all the changes we make are lost when we close the virtual computer created by Binder, for example when we turn off our computer at the end of the day. But we can export everything that we've done!
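As a small illustration of "giving Binder the URL of a GitHub repo", the sketch below builds a mybinder.org launch link from a repository owner, a repository name and a branch (or tag, or commit). The function name binder_launch_url and the owner and repo values are placeholders made up for this example, not the actual workshop repository.

```python
from urllib.parse import quote


def binder_launch_url(owner, repo, ref="HEAD"):
    """Build a mybinder.org launch link for a GitHub repository.

    `owner`, `repo` and `ref` are supplied by the caller; "HEAD" points
    at the repository's default branch.
    """
    return (
        "https://mybinder.org/v2/gh/"
        f"{quote(owner, safe='')}/{quote(repo, safe='')}/{quote(ref, safe='')}"
    )


# Placeholder values, not the real workshop repository.
url = binder_launch_url("example-user", "example-workshop")
print(url)
```

Opening such a link in a browser asks Binder to prepare the virtual computer; the launch button in a repo is simply a link of this kind.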

How will we use Binder during the workshop? We will launch Binder together at the beginning of the workshop, by clicking the button in our GitHub repo. You don't have to do anything beforehand.

... Jupyter Notebooks

The workshop materials are mainly Jupyter Notebooks. A Jupyter Notebook is first of all a file, like any other file on your computer, such as a .pdf or a .docx file: its extension is .ipynb. If you don't see file extensions on your computer, this is the right moment to fix it! Just search the web for "how to make file extensions visible" together with the name of your operating system (Windows, macOS or Linux) and you will find plenty of very easy instructions. To open a Jupyter Notebook you need the right environment, just as you need Microsoft Word or a similar program to open a .docx file. The good news is that the right environment to open a Jupyter Notebook is available in the virtual computer created by Binder, as we've seen above (and should be available on your computer if you chose to follow the optional installation instructions). A Jupyter Notebook is a special sort of file, because it can contain both text (to be read) and code (to be run, or executed): the text is written in the Markdown markup language and the code, in our case, in the Python programming language.
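To see what "a file that contains text and code" means in practice, the sketch below builds a tiny notebook by hand. Under the hood, an .ipynb file is JSON: a list of cells, each marked as either Markdown text or code. The cell contents here are invented for illustration, and in the workshop you will never need to write this structure yourself, since the Jupyter interface does it for you.

```python
import json

# A minimal notebook in the version 4 notebook format:
# one Markdown cell (text to be read) and one code cell (Python to be run).
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": ["# Collation demo\n", "Some *Markdown* text."],
        },
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": None,  # the cell has not been run yet
            "outputs": [],
            "source": ["print('hello')"],
        },
    ],
}

# This string is what would be stored in a file such as demo.ipynb.
serialized = json.dumps(notebook, indent=1)

# Reading it back shows the two kinds of cells.
loaded = json.loads(serialized)
cell_types = [cell["cell_type"] for cell in loaded["cells"]]
print(cell_types)  # → ['markdown', 'code']
```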

How will we use Jupyter Notebooks during the workshop? We will use them for all the practical parts of the workshop. We will see together how a Jupyter Notebook works in session 1 and introduce the Python language in session 2. If you've never used Markdown before, you might have a look at this cheat sheet and follow this tutorial before the workshop (max 30 mins; we suggest using the English version, because the others contain translation errors); but again, this is not compulsory.

Credits

Some of the materials collected here have been prepared by the instructors specially for this course. Others reuse existing tutorials. Many people have contributed to these materials over the years, among them Ronald Haentjens Dekker (the main developer of CollateX) and, in alphabetical order, Tara Andrews, Helena Bermúdez Sabel, David Birnbaum, Elli Bleeker, Elisa Nury, Leif-Jöran Olsson, Elena Spadini, Catherine Smith, Joris Van Zundert.

License

The materials are released under a GPL 3.0 license.