A partnership between HAL-Inria (Inria’s institutional repository), the CCSD and Software Heritage has resulted in the creation of a new type of scientific deposit. Submission in HAL makes the software citable, while long-term preservation is ensured by Software Heritage.
To enable the transfer to Software Heritage, the deposited file must be under a free license and not embargoed.
What to deposit as software source code
Currently, only a single .zip or .tar.gz archive can be uploaded. Researchers who use a version control system and also wish to archive the development history should take this limitation into account: archiving the history depends on future developments of the deposit options.
Preparation of the source code
Prepare the software source code before submitting it:
- Add the following files:
- README: describes your software (see also Make a README)
- AUTHORS: contains the list of authors and potential contributors
- LICENSE: describes the rights of use of the deposited source code (to be chosen in collaboration with the relevant services of the authors’ organization), see a list of licenses
- Create a .zip archive
- Name the archive with the software name + version
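The preparation steps above can be sketched as a small script. This is a minimal illustration, not an official HAL tool: the function name `build_archive`, the `<name>-<version>.zip` naming pattern, and the choice to accept file extensions such as README.md are all assumptions.

```python
import zipfile
from pathlib import Path

# File names requested by the guide; extensions (README.md, LICENSE.txt)
# are tolerated here as an assumption.
REQUIRED_FILES = ("README", "AUTHORS", "LICENSE")


def build_archive(source_dir, name, version, out_dir="."):
    """Zip source_dir into <name>-<version>.zip after checking required files."""
    src = Path(source_dir)
    present = {p.stem.upper() for p in src.iterdir() if p.is_file()}
    missing = [f for f in REQUIRED_FILES if f not in present]
    if missing:
        raise FileNotFoundError("missing required file(s): " + ", ".join(missing))

    archive = Path(out_dir) / f"{name}-{version}.zip"
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(src.rglob("*")):
            # Skip directories, and skip the archive itself if it is
            # being written inside the source tree.
            if path.is_file() and path.resolve() != archive.resolve():
                zf.write(path, path.relative_to(src).as_posix())
    return archive
```

For example, `build_archive("my-project", "mytool", "1.0.0")` would write `mytool-1.0.0.zip` to the current directory, or raise an error naming the missing file.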
How to deposit
Click on the Submit tab
File(s) section
Upload the compressed archive. Only one archive can be added, so it must contain all the files. The maximum size is 200 MB.
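Before uploading, the archive can be checked locally against these constraints. The sketch below is illustrative only: `check_upload` is a hypothetical helper, and the limit is assumed to mean decimal megabytes (200 × 10^6 bytes), which the guide does not specify.

```python
import os

# Assumption: "200 MB" is read as decimal megabytes.
MAX_BYTES = 200 * 1000 * 1000
ALLOWED_SUFFIXES = (".zip", ".tar.gz")


def check_upload(path, max_bytes=MAX_BYTES):
    """Return a list of problems; an empty list means the local checks pass."""
    path = os.fspath(path)
    problems = []
    if not path.lower().endswith(ALLOWED_SUFFIXES):
        problems.append("archive must be a .zip or .tar.gz file")
    size = os.path.getsize(path)
    if size > max_bytes:
        problems.append(f"archive is {size} bytes, above the {max_bytes}-byte limit")
    return problems
```

These checks only mirror the limits stated above; the submission is still checked manually on the HAL side.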
Document’s metadata section
Check that the document type is “Software”, then add the required metadata:
- Software title
- License(s): free input (the license(s) should be consistent with the content of the LICENSE file). See also a list of Open Source licenses
- Domain. The domain can be added before the submission in My profile/Submission preferences.
Display the complete list of metadata to add other information:
- Description (the description should be consistent with the content of the README file)
- Keywords
- Production date
- Classification
- ANR or European projects
- Software specific metadata:
- Programming language
- Code repository: link to the repository where the uncompiled, human-readable code and related code are located (SVN, GitHub, CodePlex)
- Platform/OS: operating systems supported (BSD, MacOSX, Windows 7, OSX 10.6, Android 1.6)
- Version: version of the software instance
- Development Status: description of development status, e.g. concept, WIP, active, inactive, suspended
- Runtime Platform: platform or script interpreter dependencies (e.g. Java v1, Python 2.3, .NET Framework 3.0)
Author metadata
It is recommended to add all the authors mentioned in the AUTHORS file. Then add at least one affiliation.
Different roles can be assigned to authors:
- Developer
- Maintainer
- Contributor
Conditions for the transfer to Software Heritage
Validate the transfer and save the submission. The submission will be checked manually before being put online.
Identification of the software and source code
If you have accepted the transfer of your archive to Software Heritage, a direct source code identifier will be included in the HAL record and in the citation. This identifier is a swh-id with the format swh:1:dir:aaaaaaaaaaaaaaaaaaaaaaaaaaa.
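To make the identifier format concrete, here is a minimal parsing sketch. It assumes the published SWHID syntax, in which the core identifier is `swh:1:<type>:<hash>` with a 40-digit lowercase hexadecimal hash and may be followed by `;key=value` qualifiers. The function name `parse_swhid` is hypothetical.

```python
import re

# Core SWHID: scheme, version 1, object type, 40 hex digits.
SWHID_CORE = re.compile(r"^swh:1:(cnt|dir|rev|rel|snp):([0-9a-f]{40})$")


def parse_swhid(text):
    """Split a SWHID into its object type, hash, and optional qualifiers."""
    core, _, qualifiers = text.partition(";")
    match = SWHID_CORE.match(core)
    if match is None:
        raise ValueError(f"not a valid core SWHID: {core!r}")
    parsed_qualifiers = {}
    for item in filter(None, qualifiers.split(";")):
        key, _, value = item.partition("=")
        parsed_qualifiers[key] = value
    return {
        "type": match.group(1),
        "hash": match.group(2),
        "qualifiers": parsed_qualifiers,
    }
```

A `dir` type designates a source directory, which is the kind of object a deposited archive is unpacked into; `rev` designates a revision.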
To reproduce an experiment, it is essential to know the exact version of the software used. Software Heritage provides the swh-id, which is intrinsically bound to the software components and ensures persistent traceability across future development and organizational changes. Like a fingerprint of the software, the swh-id is specific, persistent and unique. It does not depend on an ID resolver.
It is calculated by Software Heritage with cryptographic hash functions when your code is ingested into the global archive. With the swh-id you can find your code in the Software Heritage archive, browse its content online and download the source code.
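As an illustration of how such an identifier derives from the content itself, the sketch below computes the identifier of a single file (object type `cnt`). It relies on the SWHID scheme using the same salted SHA-1 as Git blobs for file contents. A deposit receives a `dir` identifier, computed over the whole directory tree in a comparable way, which this sketch does not cover. The function name `content_swhid` is hypothetical.

```python
import hashlib


def content_swhid(data: bytes) -> str:
    """Compute the swh:1:cnt identifier of a file's raw bytes."""
    # Same construction as a Git blob: a "blob <length>\0" header,
    # followed by the content, hashed with SHA-1.
    header = b"blob %d\0" % len(data)
    return "swh:1:cnt:" + hashlib.sha1(header + data).hexdigest()
```

Because the identifier depends only on the bytes, anyone holding the same file can recompute it and compare it with the archived one, without asking a resolver.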
How to use the persistent identifier?
Software Heritage guarantees a very long-term stable identifier: each version of the identification scheme will be maintained even after it is labeled obsolete (for instance in the case of collisions on SHA1 hashes). In addition, the identifier is not a URL, but it can be resolved by several resolvers, including the Software Heritage resolver: https://archive.softwareheritage.org/
The neutral URL and the contextual URL: the swh-id can be used in a URL as it is, or in a contextual way in order to display, in the archive's web application, additional information about the origin of the referenced object.
- The neutral URL to the deposited object with the identifier:
swh:1:rev:a27a59f6b14c9fb13a6f998d8316628dafc1f60c
- The contextual URL that allows the association of the object to its origin:
swh:1:rev:a27a59f6b14c9fb13a6f998d8316628dafc1f60c;origin=https://hal.archivesouvertes.fr/hal-01727745
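The two forms above can be assembled programmatically. This is a minimal sketch that assumes the Software Heritage resolver accepts the identifier appended directly to its base URL, and that the origin URL contains no `;` character (which would otherwise need percent-encoding). The function names are hypothetical.

```python
RESOLVER = "https://archive.softwareheritage.org/"


def neutral_url(swhid: str) -> str:
    """Resolver URL for the bare identifier."""
    return RESOLVER + swhid


def contextual_url(swhid: str, origin: str) -> str:
    """Resolver URL carrying the origin qualifier."""
    # Assumption: origin contains no ";" and needs no escaping.
    return f"{RESOLVER}{swhid};origin={origin}"


swhid = "swh:1:rev:a27a59f6b14c9fb13a6f998d8316628dafc1f60c"
print(neutral_url(swhid))
print(contextual_url(swhid, "https://hal.archivesouvertes.fr/hal-01727745"))
```

The neutral form identifies the object alone; the contextual form lets the archive show where the object was deposited from.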
The citation of a software
Software is a legitimate and valuable research product. The citation format proposed on HAL contains some of the mandatory metadata submitted with the software, along with the persistent identifiers that make it possible to locate it.
Example of citation: