A critical vulnerability in the Java library Log4j was reported (see https://www.govcert.ch/blog/zero-day-exploit-targeting-popular-java-library-log4j/ for more details).
The Big Data Link Server is currently affected by this vulnerability. If you use Big Data Link, you should replace line 47 of the start-up script 'servers/big_data_link_server/bin/big_data_link_server' with:
DEFAULT_JVM_OPTS=-Dlog4j2.formatMsgNoLookups=true
In case of questions, please contact our support desk: openBIS Support (openbis-support@id.ethz.ch).
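After editing the start-up script, a quick check can confirm that the mitigation flag is actually present. The following Python sketch scans the text of a script for an uncommented DEFAULT_JVM_OPTS assignment that contains the flag. The function name `has_log4j_mitigation` is hypothetical, and the sketch assumes the assignment sits on a single line, as in the replacement line given above.

```python
FLAG = "-Dlog4j2.formatMsgNoLookups=true"


def has_log4j_mitigation(script_text: str) -> bool:
    """Return True if an uncommented DEFAULT_JVM_OPTS line contains the flag."""
    for line in script_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            # Commented-out assignments do not take effect.
            continue
        if stripped.startswith("DEFAULT_JVM_OPTS=") and FLAG in stripped:
            return True
    return False
```

To use it, read 'servers/big_data_link_server/bin/big_data_link_server' into a string and pass that string to the function.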
With Big Data Link it is possible to track files in openBIS without storing the files themselves into openBIS. Only the metadata is stored in openBIS.
oBIS
oBIS is a command line tool which interacts with files and openBIS. It provides functionality to track file changes. This helps with data-provenance tracking. New versions of files or new files can be registered in openBIS, creating a history of data sets.
Installation
oBIS is based on pyBIS, git and git-annex. Before installing obis, make sure that git and git-annex are available.
oBIS (and pyBIS) can be downloaded from the pypi repository, from the openBIS git repository (git@sissource.ethz.ch:sispub/openbis.git) or from a built release (openBIS Download Page, Clients and APIs).
Downloading and installing from pypi is probably the easiest:
$ pip3 install obis
$ pip3 install pybis
To install oBIS from the git repository:
$ git clone git@sissource.ethz.ch:sispub/openbis.git
$ cd openbis
$ pip3 install obis/src/python
$ pip3 install obis/src/pybis
The installation process of obis tries to add manual pages to the system. If this leads to an 'access denied' error, comment out the line 'data_files=data_files,' in obis/src/python/setup.py (that is, add a hash symbol '#' in front of this line).
Commands
Help
Typing 'obis' lists all commands. Typing 'obis <command> --help' shows the options and parameters of the specified command.
Settings
With 'get' you retrieve one or more settings. If the 'key' is omitted, you retrieve all settings of the 'type'.
obis [type] [options] get [key]
With 'set' you set one or more settings.
obis [type] [options] set [key1]=[value1], [key2]=[value2], ...
With 'clear' you unset one or more settings.
obis [type] [options] clear [key1]
With the type 'settings' you can get all settings at once.
obis settings [options] get
The option '-g' can be used to interact with the global settings. The global settings are stored in ~/.obis and are copied to an obis repository when it is created.
The following settings exist:
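When obis is driven from another program, it can help to assemble these command lines in one place. The sketch below builds the argument list for a single 'get', 'set' or 'clear' call, following the pattern 'obis [type] [options] get|set|clear [key]' described above. The helper name `settings_command` is hypothetical, and it sets one key per call, as the examples in this document do. It only builds the arguments and does not run obis.

```python
import shlex
from typing import List, Optional


def settings_command(setting_type: str, action: str,
                     key: Optional[str] = None, value: Optional[str] = None,
                     use_global: bool = False) -> List[str]:
    """Build the argument list for one obis settings call."""
    if action not in ("get", "set", "clear"):
        raise ValueError(f"unsupported action: {action}")
    argv = ["obis", setting_type]
    if use_global:
        argv.append("-g")  # address the global settings in ~/.obis
    argv.append(action)
    if action == "set":
        if key is None or value is None:
            raise ValueError("'set' needs a key and a value")
        argv.append(f"{key}={value}")
    elif key is not None:
        argv.append(key)
    return argv


cmd = settings_command("config", "set", "openbis_url",
                       "https://localhost:8888", use_global=True)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` on a machine where obis is installed.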
type | setting | description |
---|---|---|
config | allow_only_https | Default is true. If false, http can be used to connect to openBIS. |
config | fileservice_url | URL for downloading files. See DownloadHandler / FileInfoHandler services. |
config | git_annex_backend | Git annex backend to be used to calculate file hashes. Supported backends are SHA256E (default), MD5 and WORM. |
config | git_annex_hash_as_checksum | Default is true. If false, a CRC32 checksum will be calculated for openBIS. Otherwise, the hash calculated by git-annex will be used. |
config | hostname | Hostname to be used when cloning / moving a data set to connect to the machine where the original copy is located. |
config | openbis_url | URL for connecting to openBIS (only protocol://host:port, without a path). |
config | obis_metadata_folder | Absolute path to the folder which obis will use to store its metadata. If not set, the metadata will be stored in the same location as the data. This setting can be useful when dealing with read-only access to the data. The clone and move commands will not work when this is set. |
config | user | User for connecting to openBIS. |
data_set | type | Data set type of data sets created by obis. |
data_set | properties | Data set properties of data sets created by obis. |
object | id | Identifier of the object the created data set is attached to. Use either this or the collection id. |
collection | id | Identifier of the collection the created data set is attached to. Use either this or the object id. |
repository | data_set_id | This is set by obis. It is the id of the most recent data set created by obis and will be used as the parent of the next one. |
repository | external_dms_id | This is set by obis. Id of the external dms in openBIS. |
repository | id | This is set by obis. Id of the obis repository. |
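
To illustrate what a CRC32 checksum is (the alternative used when git_annex_hash_as_checksum is false), the following Python sketch computes one for a file with the standard library. This is an illustration only: the function name `crc32_of_file` is hypothetical, and how obis itself computes, formats or transmits the value is not covered here.

```python
import zlib


def crc32_of_file(path: str, chunk_size: int = 1 << 20) -> int:
    """Compute the CRC32 of a file, reading it in chunks to bound memory use."""
    checksum = 0
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            # zlib.crc32 accepts the running value as its second argument.
            checksum = zlib.crc32(chunk, checksum)
    return checksum & 0xFFFFFFFF
```

CRC32 is cheap to compute but only 32 bits wide, so it detects accidental corruption rather than deliberate tampering. A git-annex backend such as SHA256E yields a cryptographic hash instead.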
The settings are saved within the obis repository, in the .obis folder, as JSON files, or in ~/.obis for the global settings. They can be added / edited manually, which might be useful when it comes to integration with other tools.
Example .obis/config.json
{
    "fileservice_url": null,
    "git_annex_hash_as_checksum": true,
    "hostname": "bsse-bs-dock-5-160.ethz.ch",
    "openbis_url": "http://localhost:8888"
}
Example .obis/data_set.json
{
    "properties": {
        "K1": "v1",
        "K2": "v2"
    },
    "type": "UNKNOWN"
}
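Because the settings are plain JSON files, another tool can edit them directly. The following Python sketch merges new values into one of the files in the .obis folder. The helper name `update_settings` is hypothetical, and writing the keys in sorted order with four-space indentation is an assumption chosen to resemble the examples above, not a documented requirement of obis.

```python
import json
from pathlib import Path


def update_settings(repo: Path, name: str, **changes) -> dict:
    """Merge 'changes' into <repo>/.obis/<name>.json and return the result."""
    folder = repo / ".obis"
    if not folder.is_dir():
        # Refuse to create .obis: a missing folder means this is not a repository.
        raise FileNotFoundError(f"{folder} does not exist")
    path = folder / f"{name}.json"
    settings = {}
    if path.exists():
        settings = json.loads(path.read_text(encoding="utf-8"))
    settings.update(changes)
    path.write_text(json.dumps(settings, indent=4, sort_keys=True) + "\n",
                    encoding="utf-8")
    return settings
```

For example, `update_settings(Path("data1"), "data_set", type="UNKNOWN")` would set the data set type of the repository 'data1'. Since obis does not see such edits as they happen, it is worth checking the result afterwards with 'obis settings get'.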
Init
obis init [folder]
If a folder is given, obis will initialize that folder as an obis repository. If not, it will use the current folder.
Init_analysis
obis init_analysis [options] [folder]
With init_analysis, a repository can be created which is derived from a parent repository. If it is called from within a repository, that will be used as a parent. If not, the parent has to be given with the '-p' option.
Commit
obis commit [options]
The 'commit' command adds files to a new data set in openBIS. If the '-m' option is not used to define a commit message, the user will be asked to provide one.
Sync
obis sync
When git commits have been done manually, the 'sync' command creates the corresponding data set in openBIS. Note that, when interacting with git directly, use the git annex commands whenever applicable, e.g. use "git annex add" instead of "git add".
Status
obis status [folder]
This shows the status of the repository folder from which it is invoked, or the one given as a parameter. It shows file changes and whether the repository needs to be synchronized with openBIS.
Clone
obis clone [options] [data_set_id]
The 'clone' command copies a repository associated with a data set and registers the new copy in openBIS. If there are already multiple copies of the repository, obis will ask from which copy to clone. To avoid user interaction, the copy index can be chosen with the option '-c'. With the option '-u', a user can be defined for copying the files from a remote system. By default, the file integrity is checked by calculating the checksum. This can be skipped with '-s'.
Note: This command does not work when obis_metadata_folder is set.
Move
obis move [options] [data_set_id]
The 'move' command works the same as 'clone', except that the old repository will be removed.
Note: This command does not work when obis_metadata_folder is set.
Download
obis download [options] [data_set_id]
The 'download' command downloads the files of a data set. Contrary to 'clone', this will not register another copy in openBIS. It is only for accessing files. This command requires the DownloadHandler / FileInfoHandler microservices to be running and the 'fileservice_url' needs to be configured.
Addref / removeref
obis addref
obis removeref
Obis repository folders can be added to or removed from openBIS. This can be useful when a repository was moved or copied without using the 'move' or 'clone' commands.
Examples
Create an obis repository and commit to openBIS
# global settings to be used for all obis repositories
obis config -g set openbis_url=https://localhost:8888
obis config -g set user=admin
# create an obis repository with a file
obis init data1
cd data1
echo content >> example_file
# configure the repository
obis data_set set type=UNKNOWN
obis object set id=/DEFAULT/DEFAULT
# commit to openBIS
obis commit -m 'message'
Commit to git and sync manually
# assuming we are in a configured obis repository
echo content >> example_file
git annex add example_file
git commit -m 'message'
obis sync
Create an analysis repository
# assuming we have a repository 'data1'
obis init_analysis -p data1 analysis1
cd analysis1
obis data_set set type=UNKNOWN
obis object set id=/DEFAULT/DEFAULT
echo content >> example_file
obis commit -m 'message'
Big Data Link Services
The Big Data Link Services can be used to download files which are contained in an obis repository. The services are included in the installation folder of openBIS, under servers/big_data_link_services. For how to configure and run them, consult the README.md file.