In this article:

Introduction and overview

   Role restrictions

   The Batch tool interface

   Tips for the upload CSV

Migrate records with administrative batch management

Uploading files


Introduction and overview


Why is there no "too long; didn't read" section? Because we implore you to read this entire document! By using the batch tool you are levelling up your skills and will need all the information here. 


The batch tool is designed to be used for any repository structure and therefore requires a bit of planning and testing before use. Please read through this full document to understand how it works and what you will need to think about for your repository. The first section provides a high level overview. Further down, you will find a detailed description of a migration process using the tool and below that specific guidance on uploading files.


This functionality enables the creation and management of many records at once within your Figshare repository, up to thousands of records. This is not just for items you own, but any item that you have the permission to edit via impersonation. You can find the Batch Management tool in the drop-down menu from your account icon in the upper right or on your account page.


IMPORTANT NOTE - YOU MUST READ THIS: Please ensure that the metadata that you intend to publish using batch management is thoroughly tested on your stage environment, to prevent the need to make changes after initial publication. Before publishing and/or updating items using batch management, it is very important to be aware of changes to metadata that cause item versioning. Figshare does not support removal of versions. While admin users can unpublish individual items (in order to update and republish without creating a new version), removal of versions in batch is not supported. In the event that you unexpectedly create multiple versions via batch management, our Support Team will charge for help removing versions in batch, and the work cannot be treated as urgent.


Role restrictions

This functionality is restricted to Institutional Administrators only, that is, admins in the top level group.


The Batch tool interface

The page has two sections, each described below: the Download metadata section and the Upload metadata and files section.

 



Download metadata

The output of this section will be a CSV containing the metadata of the requested items. This will be sent asynchronously in a zip file to your account's email when it has been prepared. 


There are 2 main options to consider: 


  • Do you want metadata of only public items or all items in any state (public/private/draft)?
    • The state of the object will be shown in the download sheet
  • Do you want the metadata of objects from across your whole repository or from specific groups only?


If you have a variety of metadata across different groups, then each metadata field will be represented in its own column and will be empty for items that don’t have those fields applied to them.



Upload metadata and files 

This section is used for both the creation of new items and the editing of existing items. You can choose to publish items or push the new changes or newly-created items straight to private. If you choose to publish, all items included in the upload sheet will be published.


If your repository has the review module enabled, you can choose to automatically approve the items. This will not skip the review process, but will create a Review entry and approve it from your login. 


If you are creating new items, it’s a good idea to download the existing metadata spreadsheet first. This way, you’ll see all of the existing metadata fields and how they are organized. If you don’t have any existing items to download, create a private item utilizing all of your fields to see what the upload sheet should look like. Please note that the CSV download will include columns that may not be relevant to your upload process. This is described more below.


The only thing that will not be present on the download sheet that you may want on the upload sheet is a files column.


You can attach files to items from any publicly available link (http and https), including FTP. To do this, you’ll need to add a column called 'files'. See the file uploading section on this page for detailed information.


Once you have created and uploaded your sheet, Figshare will enact the requested changes. Depending on the number of requested changes, this can take minutes to hours.


Note: To ensure that your requested changes are processed in the shortest possible amount of time, please make sure to only include in the sheet those items that you wish to update. 


When these changes are complete, a CSV will be sent to your account email with the success or failure information for each item. The row number in this will be the row the item was in the original upload sheet. 


The editing and creation process works on an “all or nothing” approach per item, so even if only one aspect fails or is incorrect then no changes to the item will occur. The one caveat here is that the metadata upload and file upload are separate actions that happen in sequence: if the metadata for an item is correct, the item will be created, but the file upload may still fail or succeed.


Tips for the upload CSV

  • account_id: Leaving this blank will create items in your admin account.
    • If you need the account IDs of users who have not created any items yet, you can use this endpoint: https://docs.figshare.com/#private_institution_accounts_list
    • Adding an account ID creates the item in that account, so that user will be able to make edits in the future. When you investigate the item, it will show as created via impersonation.
  • group_id: Leaving this blank will add items to the default group, i.e. the top level group.
  • project_id: add an id here only if you want to add the item to an existing project
  • categories: This field accepts more than just the category name. Any of the following formats works, but note that the square brackets are required: ["category name", {"id": 1234}, {"title": "another category name"}]
    • The id is Figshare's id for that category. The best way to find it is to locate a public item that uses the category (use Figshare search) and look at the metadata for that item through this API endpoint: https://docs.figshare.com/#public_article
  • Fields to ignore when using the download CSV for an upload: 
    • doi and handle - only if you are not uploading pre existing identifiers (this functionality needs to be enabled for your repository)
    • status, private_link, references, resource_title, resource_doi
    • references, resource_title, and resource_doi are all deprecated; use related_materials instead
  • Embargoes
    • is_embargoed: 0 = No, 1 = Yes
    • embargo_type is either file (files only) or article (embargoes the whole item, metadata and files)
    • embargo_allowed_administrators: 0 = No, 1 = Yes
  • HTML tags are supported for the description and title, e.g. <p>This is a description. </p> <p>It is of an item</p>
  • Date formatting: for any field where a date is added like publication date, acceptance date, first online date or custom date field, the format YYYY-MM-DD must be used.
  • Time Limit: By default, one download and one upload are allowed per hour. If you need this to be temporarily increased, please contact support.  
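
The tips above can be sketched as a small script that writes one upload row. This is a minimal sketch under stated assumptions, not a definitive template: the column name publication_date and the helper names build_row and rows_to_csv are hypothetical, and the real column names should be taken from the CSV you download from your own repository.

```python
import csv
import io
import json
from datetime import date


def build_row(title, description, categories, published):
    """Return one upload row as a dict keyed by column name."""
    return {
        "title": title,
        # HTML tags are supported in the title and description.
        "description": description,
        # JSON list; the square brackets are required.
        "categories": json.dumps(categories),
        # Hypothetical column name, for illustration only.
        "publication_date": published.isoformat(),  # YYYY-MM-DD
        "is_embargoed": 0,
        # Blank account_id and group_id: admin account, top level group.
        "account_id": "",
        "group_id": "",
    }


def rows_to_csv(rows):
    """Serialize rows to CSV text; the csv module quotes the JSON cells."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


row = build_row(
    "Example item",
    "<p>This is a description.</p>",
    ["category name", {"id": 1234}],
    date(2024, 3, 5),
)
print(rows_to_csv([row]))
```

Letting the csv module handle the quoting avoids hand-escaping the double quotes inside the JSON cells.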


Migrate records with administrative batch management


Common issues

Getting started

Migrating to different groups

User accounts and author linking

    Migrating items into researcher accounts

    Connecting migrated items to author accounts and profiles

Adding funding information

Adding related materials

Add existing DOIs or Handles

A possible migration workflow


Common issues

Here are the most common issues and how to solve them:

  • Item is not created - Likely there is a metadata issue: check that columns have correct names and capitalization. Specific field issues will be noted in the error report
  • Missing metadata - Either the column name is incorrect or the content is not formatted correctly. The most common culprit is dates: use YYYY-MM-DD and check this every time before saving your CSV, since Excel usually changes date formats.
  • Files not uploading - Check that the file column is named "files", note lowercase 'f' and plural! Or, the file is not actually available to the Figshare system. Try pasting the file url into your browser. If the file doesn't immediately download or doesn't display in your browser's pdf viewer, it's not accessible. Note that epub formats usually won't work.
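
Two of the issues above (a misnamed files column and reformatted dates) can be caught before uploading. The following is a rough sketch of such a pre-flight check; the function name check_upload_csv is hypothetical, and you would pass in whichever date column names your own sheet uses.

```python
import csv
import io
import re

# Checks the shape only (four digits, two digits, two digits),
# not whether the date actually exists in the calendar.
DATE_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}")


def check_upload_csv(text, date_columns):
    """Return a list of problems found in upload CSV text."""
    problems = []
    reader = csv.DictReader(io.StringIO(text))
    for name in reader.fieldnames or []:
        # Catch near misses such as "Files" or "file".
        if name != "files" and name.strip().lower() in ("file", "files"):
            problems.append(f"column {name!r} should be named 'files'")
    # Spreadsheet row 1 is the header, so data starts at row 2.
    for row_number, row in enumerate(reader, start=2):
        for column in date_columns:
            value = (row.get(column) or "").strip()
            if value and not DATE_PATTERN.fullmatch(value):
                problems.append(
                    f"row {row_number}: {column} is {value!r}, expected YYYY-MM-DD"
                )
    return problems
```

Running this on the saved CSV (rather than on the open spreadsheet) checks what will actually be uploaded, after any automatic reformatting by the spreadsheet program.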


Getting started

At this point you should have your repository set up with the groups and custom metadata fields you will need for the migration. We advise using your stage instance to create dummy draft items in each group that you will be migrating into. In these dummy items, fill in any custom metadata fields, add categories, keywords, funding, and create some examples of embargo files. You can use the batch management tool to download the metadata from these dummy items to use as a template for uploading.


Three other pieces of advice: 

  1. Send a support ticket to enable the existing DOI/Handle upload for both Stage and Production
  2. Send a support ticket to temporarily reduce the wait time between batch uploads while you are conducting tests and migration (by default this is set to 60 minutes)
  3. Carefully check formatting in your CSV file before uploading- especially date formats as some spreadsheet programs change these automatically


Migrating to different groups

You may want to use separate CSV files to upload items by group rather than migrating items to different groups using one CSV file. This will make it easier to make sure the metadata in the CSV is formatted correctly and that the custom fields are filled out properly.


You can get a list of your group ids from this endpoint: https://docs.figshare.com/#private_institution_groups_list (Make sure you create an API token from a top level administrator account and paste it in the top left field on that API documentation page.)


Put the group id in the group_id column in the CSV and items will automatically be associated with that group. 
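
If you prefer a script to the documentation page, the group list can be requested as sketched below. Treat the details as assumptions to verify against the API documentation linked above: the base URL, the /account/institution/groups path, the "token" authorization header, and the "id" and "name" response fields are written from memory of the public Figshare API, and your stage instance will likely use a different base URL. The helper names are hypothetical.

```python
import json
import urllib.request

# Assumed production base URL; check the API documentation for your instance.
API_BASE = "https://api.figshare.com/v2"


def build_groups_request(token):
    """Build (but do not send) the request for the institution group list."""
    return urllib.request.Request(
        f"{API_BASE}/account/institution/groups",
        headers={"Authorization": f"token {token}"},
    )


def group_ids_by_name(groups):
    """Map group name to group id from the decoded JSON response."""
    return {group["name"]: group["id"] for group in groups}


def fetch_groups(token):
    """Send the request; needs network access and a top level admin token."""
    with urllib.request.urlopen(build_groups_request(token)) as response:
        return json.load(response)
```

Calling group_ids_by_name(fetch_groups(token)) gives a lookup you can use to fill the group_id column.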


User accounts and author linking

TL;DR: use account_id to put records in an author’s repository account so they can edit the record. Use user_id to affiliate a specific author with a record whether they own the record or not.


In Figshare, there can be two ids associated with a researcher. If a researcher has an account in your repository (whether created by SSO, HRFeed, or manually) they will have an account id. If a researcher is listed as an author on a repository item, whether they have an account or not, they will have a user id. You can use the account id to give edit access to a researcher for migrated items. A researcher with an account will also have a profile. But please note that the profile page uses the person’s user id since that is how they are associated to records, whether they ‘own’ the record or not. You can use the user id when uploading metadata to make sure items show up in researcher profiles. This also enables better reporting because it reduces duplicated author names across repository items.


Migrating items into researcher accounts

If you want researchers to have edit access to an item, you need to put their account id in the account_id column in the CSV. If you do not add an account_id, the item will be uploaded to the administrator account that is running the batch upload. 

You will need to have the researcher accounts created before migration in order to get the account_id (to provide edit access), and the user_id (to link them as an author if needed). The best way to do this is create the accounts manually through the API and add the SSO id in the “institution_user_id”. You can also create accounts through an HRfeed or, once your repository is launched, you can ask researchers to login through SSO which will automatically create their account. You can then retrieve the account_id and user_id as needed.

You can see the account_ids either in the User Report or from this API endpoint: https://docs.figshare.com/#private_institution_accounts_list.


Connecting migrated items to author accounts and profiles

You could upload author names as ‘first name’ and ‘last name’ for each item but this is not recommended for existing authors. Each first/last name combination will receive its own database entry - lots of duplicates! -  and will make reporting more difficult than it needs to be.


To connect an item with an existing author account, simply add the author to the CSV item using the user_id. You can see the user_ids either in the User Report or from this API endpoint: https://docs.figshare.com/#private_institution_accounts_list. Add the authors in the ‘authors’ column in this format: 


[{"id": 1438453}, {"id": 1438451}, {"id": 701402}, {"id": 1438455}]


If you add any other data (like “first name”) after the id value, it will be ignored.  Adding authors in this way will automatically connect the item to the author information stored in the database including ORCID and CRIS/RIM id. The item will show up in each author’s profile if they have an account in any Figshare powered repository.
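
When the user ids come from a report or script, the authors cell can be generated rather than typed. A minimal sketch (the function name authors_cell is hypothetical):

```python
import json


def authors_cell(user_ids):
    """Format user ids as the JSON list expected in the authors column."""
    # Drop repeated ids while keeping the original author order.
    unique_ids = list(dict.fromkeys(int(user_id) for user_id in user_ids))
    return json.dumps([{"id": user_id} for user_id in unique_ids])


print(authors_cell([1438453, 1438451, 701402, 1438455]))
```

Converting each value with int() also catches ids that were pasted with stray text, since those raise an error instead of ending up in the sheet.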


Note: If your repository will be integrated with a CRIS/RIM system like Symplectic Elements, you will need to add authors using the user_id so that the item can be harvested into the CRIS/RIM system properly.


Note: If you want an item in your repository to show up in a user account in another Figshare repository, like in figshare.com or at another university, you will need to get that author’s user id from their profile page. This person’s user id is the number at the end of the profile URL: https://figshare.com/authors/_/473204.

Adding funding information

You can add funding information as free text by including the grant name in the funding column using this format:


[{"title": "My grant 1"},{"title": "My grant 2"}]


You can also link grants from large funders to the grant item in the Dimensions database. At this time, this is a two step process. You need to find the id for the grant in the Figshare system using this API endpoint: https://docs.figshare.com/#private_funding_search (you’ll need to add your API token to the field in the top left). If you find the grant, add the id to the funding column like this:


[{"id": 9621728},{"id": 3058082}]


The two grant items for those ids are pictured below.
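
The two funding formats, and the search step, can be sketched as follows. The cell formats come from the examples above. The search request is an assumption to check against the API documentation: the base URL, the /account/funding/search path, the POST method, and the "search_for" body field are written from memory of the public Figshare API. All function names are hypothetical.

```python
import json
import urllib.request

# Assumed production base URL; check the API documentation for your instance.
API_BASE = "https://api.figshare.com/v2"


def build_funding_search_request(token, search_text):
    """Build (but do not send) a funding search request."""
    body = json.dumps({"search_for": search_text}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/account/funding/search",
        data=body,
        method="POST",
        headers={
            "Authorization": f"token {token}",
            "Content-Type": "application/json",
        },
    )


def funding_cell_from_titles(titles):
    """Free-text grants: one {"title": ...} entry per grant name."""
    return json.dumps([{"title": title} for title in titles])


def funding_cell_from_ids(grant_ids):
    """Linked grants: one {"id": ...} entry per Figshare grant id."""
    return json.dumps([{"id": int(grant_id)} for grant_id in grant_ids])
```

The two cell builders are kept separate because this article does not say whether titles and ids can be mixed in one cell; test that on stage before relying on it.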


Adding related materials

Links to related materials, like a published paper, dataset, or different version of a paper, are added in the related_materials column. As with authors and funding, the content needs to be formatted as JSON. This is an example:


[{"identifier": "10.1038/s41550-020-1208-y", "title": "The ecological impact of high-performance computing in astrophysics", "identifier_type": "DOI", "relation": "IsSupplementTo", "is_linkout": 1}]


The ‘identifier_type’ field is sourced from DataCite’s RelatedIdentifier list. The ‘relation’ field is sourced from DataCite’s list of relation types. The ‘is_linkout’ field determines if the linked title shows up in a call-out box on the record page and can take the value 0 (zero) or 1 (one). Up to five links can be shown as call-out boxes.
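
A related_materials cell can be built and sanity-checked with a few lines of Python. This is a sketch, assuming you want to stop early when more than five entries are flagged as call-outs; the article only says that up to five can be shown, so treating a sixth as an error is a choice made here, not documented behavior. The function name is hypothetical.

```python
import json

MAX_LINKOUTS = 5  # "Up to five links can be shown as call-out boxes."


def related_materials_cell(materials):
    """Validate is_linkout values and return the JSON cell text."""
    for material in materials:
        if material.get("is_linkout", 0) not in (0, 1):
            raise ValueError("is_linkout must be 0 or 1")
    linkouts = sum(material.get("is_linkout", 0) for material in materials)
    if linkouts > MAX_LINKOUTS:
        raise ValueError(f"{linkouts} call-outs requested, at most {MAX_LINKOUTS} are shown")
    return json.dumps(materials)


cell = related_materials_cell([
    {
        "identifier": "10.1038/s41550-020-1208-y",
        "title": "The ecological impact of high-performance computing in astrophysics",
        "identifier_type": "DOI",
        "relation": "IsSupplementTo",
        "is_linkout": 1,
    }
])
print(cell)
```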


Add existing DOIs or Handles

You may want to upload items that already have a DOI or a Handle. Add the DOI or Handle in the appropriate column without the URL information (e.g. 10.1636/P10-15.1). Your repository needs this functionality enabled before this will work. You can send a support ticket requesting the pre existing DOI/Handle functionality be turned on for some or all groups.
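
If your source system exports identifiers as full links, the URL part can be stripped before the values go into the CSV. A minimal sketch; the list of prefixes is an assumption about what your export might contain, and the function name is hypothetical.

```python
# Common resolver prefixes; extend this tuple to match your own data.
RESOLVER_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "https://hdl.handle.net/",
    "http://hdl.handle.net/",
    "doi:",
)


def bare_identifier(value):
    """Return a DOI or Handle without any leading URL information."""
    value = value.strip()
    lowered = value.lower()
    for prefix in RESOLVER_PREFIXES:
        if lowered.startswith(prefix):
            return value[len(prefix):]
    return value


print(bare_identifier("https://doi.org/10.1636/P10-15.1"))
```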


A possible migration workflow

Every institution will have unique migration needs. Please use this as just an example. This workflow assumes the records will be migrated into one administrator account for editing but the items are linked to existing researcher user_ids.

  1. Create custom metadata fields and create dummy records in your stage instance.
  2. Set up researcher accounts manually or through HRfeed or SSO login.
  3. Use batch management to download the metadata from your stage instance. The resulting CSV is the template for upload. Note that private link is ignored when uploading.
  4. Create a mapping for your current metadata to the metadata fields in your new Figshare repository. You may need to combine some fields or split some fields.
  5. Set up a way to format the metadata to match the CSV downloaded via batch management. For example, use a spreadsheet program or script.
  6. Format the author names for upload: either add the user_ids or split the names to first and last name with all the right formatting (you can use the concatenation feature in spreadsheet programs to format). You may need to do this in a separate file and then add the formatted author info to the ‘authors’ column.
  7. If you are also adding funding ids, add the formatted values to the ‘funding’ column.
  8. Transfer your metadata using your mapping into a new CSV.
  9. Add a column titled ‘files’ and add the public URL or FTP location for the file(s) belonging to each metadata record. Follow the formatting requirements!
  10. Split your metadata file into separate CSVs by group. This way you can double check group specific custom metadata fields, categories, etc.
    • Fill in group id in each group CSV under the group_id column
  11. Test migration in Stage using a subset of each group’s records (10 to 100 records)
    Important: When uploading the metadata CSV file:
    • Check all the date formatting before saving the CSV as your spreadsheet program may change the formats. They should all be YYYY-MM-DD.
    • If you are editing existing records, save the file as “UTF” when using Excel (not UTF-8 as this is really UTF-8-BOM), otherwise the system may create items rather than update existing items. You can check the encoding of your CSV file using these instructions - the Figshare system does not want any “- BOM” encoding.
  12. Check the records and make any adjustments. Retest if needed.
    • It might be helpful to use this Jupyter Notebook to delete a list of records from an admin account to start with a clean slate each time
  13. Migrate records to Production
    • Going group by group, upload the CSV file and publish the records
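
Step 11 warns about byte order mark (BOM) encoding. One way to check a saved CSV for a UTF-8 BOM, and to remove it, is sketched below. The function names are hypothetical, and this only addresses the UTF-8 BOM; it does not convert files saved in other encodings.

```python
import codecs
from pathlib import Path


def has_utf8_bom(raw_bytes):
    """True if the data starts with the three-byte UTF-8 BOM."""
    return raw_bytes.startswith(codecs.BOM_UTF8)


def strip_utf8_bom(raw_bytes):
    """Return the data without a leading UTF-8 BOM, if one is present."""
    if has_utf8_bom(raw_bytes):
        return raw_bytes[len(codecs.BOM_UTF8):]
    return raw_bytes


def rewrite_without_bom(path):
    """Rewrite the file in place only when a BOM was found."""
    file_path = Path(path)
    raw_bytes = file_path.read_bytes()
    if has_utf8_bom(raw_bytes):
        file_path.write_bytes(strip_utf8_bom(raw_bytes))
        return True
    return False
```

Working on raw bytes rather than decoded text means the rest of the file is left exactly as the spreadsheet program saved it.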


Adding files

Add a files column to your CSV


Files need to be available to the Figshare system whether from a web server or an FTP location. Add a column called ‘files’ to the CSV and add file URLs like this:


["https://journals.aqs.org/pdf/10.1103","ftp://mirror.easyname.at/ubuntu-releases/robots.txt"]


Here's an example:

 

files
["https://journals.aqs.org/pdf/10.1103","ftp://mirror.easyname.at/ubuntu-releases/robots.txt"]


Note: This example would upload two files, each from a different location, to one item. The formatting is important: lowercase column name 'files' and the file location URLs enclosed in double quotes, separated by commas, and surrounded by one set of square brackets.
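
The formatting rules in the note can be applied mechanically. A minimal sketch (the function name files_cell is hypothetical) that rejects anything other than http, https, or ftp links and produces the bracketed, double-quoted list:

```python
import json
from urllib.parse import urlparse

ALLOWED_SCHEMES = {"http", "https", "ftp"}


def files_cell(urls):
    """Format file URLs as the JSON list expected in the files column."""
    for url in urls:
        if urlparse(url).scheme.lower() not in ALLOWED_SCHEMES:
            raise ValueError(f"not an http, https, or ftp URL: {url!r}")
    # Compact separators reproduce the style of the example above.
    return json.dumps(list(urls), separators=(",", ":"))


print(files_cell([
    "https://journals.aqs.org/pdf/10.1103",
    "ftp://mirror.easyname.at/ubuntu-releases/robots.txt",
]))
```

This checks the format of each link only; whether the file is actually reachable still needs the browser test described in the next section.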


How do I make files available to Figshare?


Files saved on your computer are not available to Figshare- there's no way for the system to access your computer and then know where the files are. The files need to be available on the internet in a way that the Batch tool can access them.


A file on the internet has an associated URL or link. You can test if a file URL/link will work for Figshare by pasting the file URL into your browser and hitting enter. If your computer downloads the file or previews it in the browser rather than opening it on a special webpage or taking you to a landing page, you should be able to use it for upload.


If migrating from an old repository, the files may already be publicly available and you can use the file URLs from there. 


If the files are not publicly available anywhere, you will need to make them accessible to the Figshare system. 


Note: No matter what method you use, you need a way to match up the file links with your upload metadata!


Here are several suggestions: 


Ask your institution's IT department

Your IT department might be able to make this easy for you. If they can make all your files available for the Batch tool, you can avoid doing this yourself. They might be able to provide an FTP server for you to use. On the other hand, why not learn a new skill and cultivate some independence from IT? :)


Use a web server

If you are familiar with setting up webpages using Wordpress or other platforms, one option is to add the files to a web server and match up the URLs for each file with your upload metadata. A downside here is that you need access to a web server and the files will temporarily be publicly available to anyone. Some institutions let staff and students create personal webpages using institution resources and this may work for you.


Using FTP  

Using FTP is fairly straightforward but does require some testing so you understand how it works. Every Figshare account has FTP credentials to a Figshare FTP server. This means that you can add files to a cloud server that are then accessible to the Figshare system.


The Figshare FTP system is not designed specifically for use with the Batch tool, rather it is designed to help users upload really big files or lots of files to one or more new draft items in their account. But, you can use this system with the Batch Tool by doing the following:

  1. Follow the instructions to connect to the Figshare FTP server (do this on stage for testing and then create a new connection for production when you're ready to go)
    1. Really important note: If your Figshare account uses SSO for sign in and you clicked the button to create an FTP password, make sure the password does not have any forward or backward slashes as these may cause issues later. Recreate your password until no slashes appear and use this for the next steps. 
  2. Figshare provides a 'data' folder for you. Don't create any subfolders here! If you do this, Figshare will create draft items in your account each titled with the subfolder name. Instead, add the files for upload directly to the 'data' folder.
  3. Once they are uploaded, highlight all the files in the data folder, right click (or command click) to see the little pop up menu and select the option to copy the URLs to your clipboard.
    1. Each file URL will look like this: ftp://3801041@ftps.figshare.com/data/thisIsMyFilename.pdf
  4. Paste these URLs into your upload CSV, format them with brackets and double quotes and match each to its metadata row.
  5. Add your FTP password to the URLs manually or using Find/Replace:
    1. Add a colon after your account number and add your password. It should look like this: ftp://3801041:myPassword@ftps.figshare.com/data/thisIsMyFilename.pdf 
    2. Note: This is where that password caveat from above is important. Certain characters will make the URL unusable.
  6. Each row in the CSV should have one or more file URLs formatted like this: ["ftp://3801041:myPassword@ftps.figshare.com/data/thisIsMyFilename.pdf"]
  7. These files will be available temporarily since Figshare cleans out the FTP server regularly.
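
Steps 4 to 6 can also be done with a short script instead of Find/Replace. This is a sketch under the assumption that the copied URLs look exactly like the example in step 3; the function names are hypothetical.

```python
import json


def add_password(url, password):
    """Insert the FTP password after the account number in a copied URL."""
    # Slashes in the password make the URL unusable (see step 1).
    if "/" in password or "\\" in password:
        raise ValueError("password contains a slash; recreate it first")
    prefix, separator, rest = url.partition("@")
    if not prefix.startswith("ftp://") or not separator:
        raise ValueError(f"not a copied FTP URL: {url!r}")
    return f"{prefix}:{password}@{rest}"


def ftp_files_cell(urls, password):
    """Return the files cell for one item: a JSON list of FTP URLs."""
    return json.dumps([add_password(url, password) for url in urls])


print(ftp_files_cell(
    ["ftp://3801041@ftps.figshare.com/data/thisIsMyFilename.pdf"],
    "myPassword",
))
```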