Advanced uses of Zotero

I mentioned doing a workshop covering some advanced uses of Zotero that included using searchable codes for organizing writing projects, documenting search strategies for formal lit reviews, and using the results list importer and the multi-format exporter for citation analysis projects. Here is a summary of my notes.

using searchable codes for organizing writing projects – this works in any citation manager that has a note or tag field, but it works particularly well in Zotero because the tags feature is separate from other keywords. I learned this tip from a PhD student who was using EndNote, back in the day.

Come up with codes for each section, argument, or other organizational feature of whatever you are writing. They should be unique to the project (e.g. not “Intro”) because once you've tagged citations with them you will be able to search for the tag and pull up everything you meant to use in that section.

The key here is that Zotero is searchable, and it's searchable using ANYTHING that you put into it. Use that.

I've never matched my friend's very elaborate coding system for his dissertation, but the idea has been handy for projects where I'm juggling a lot of citations.
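Outside Zotero, the same idea can be sketched in a few lines of Python against an exported CSV. This is only an illustration: the sample data and the code "diss-ch2-methods" are made up, and the "Title" and "Manual Tags" column names and the semicolon separator are assumptions about what the CSV export looks like, so check them against your own file.

```python
import csv
import io

# Hypothetical stand-in for a Zotero CSV export. The column names and the
# semicolon-separated tag list are assumptions, not taken from a real export.
SAMPLE = """Title,Manual Tags
Paper A,diss-ch2-methods; to-read
Paper B,diss-ch3-results
Paper C,diss-ch2-methods
"""

def items_with_tag(csv_text, code):
    """Return the titles of rows whose tag list contains the exact code."""
    matches = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Split the tag field and trim whitespace so matching is exact,
        # not a substring match.
        tags = [t.strip() for t in (row.get("Manual Tags") or "").split(";")]
        if code in tags:
            matches.append(row["Title"])
    return matches

print(items_with_tag(SAMPLE, "diss-ch2-methods"))
```

Matching the whole tag rather than a substring is the reason the codes need to be unique to the project.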

documenting search strategies for formal lit reviews – in a formal lit review (systematic, integrative, etc.) you generally want to keep track of what you found using what search, but you also need to be able to eliminate duplicates and not get overwhelmed. Zotero to the rescue. In the process I worked out with a couple of researchers, a new folder is created in Zotero for each search. The results (yes, all the results) are exported from whatever database or search engine is being used – set the results pages to display the greatest possible number of results per page and hit the web importer, going page by page.* Then create a new folder and do the next search, and so on. Each folder then holds exactly the results of its search.

Once all the searching is done, run the de-dup tool. De-dup merges entries, but keeps the merged entry in the original folders. At this point, create a Review folder and copy all the entries into it so you can run any additional review/selection criteria against them. After this, you don't touch the search folders. They are your archive. The ones that make it through the additional selection get put into yet another folder, ready for the final analysis (whatever that is, based on the purpose of the review). This step can be done all at once or step by step if the selection criteria are more complicated. Once each step is finished and the results are in the folder for the next step, that previous folder is left as an archive. Never remove anything from a folder, just create a new folder for the selected entries.

If the review is being done by more than one person, separate folders can be created for each person's selection criteria phase, or the selection can be done by consensus (tagging can work for this).

If additional searching turns out to be necessary, you can add more search folders, run the de-dup again, and add the new results to the review folder. If the selection criteria need to be adjusted, the original review folder is still there, untouched, ready to be reanalysed.

There is a trail of exactly what was done, what the results were, and Zotero tracks when things were added to the library, so there is even some chronological tracking.
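To make the bookkeeping concrete, here is a rough Python sketch of the numbers this folder trail lets you report: results per search, the total, and what is left after de-duplication. Everything in it is hypothetical, including the folder names and records, and the matching rule (DOI if present, otherwise a normalized title) is a simplified stand-in, not how Zotero's own duplicate detection works.

```python
# Each search "folder" maps to the records it returned. All data is made up.
searches = {
    "search-01 database A": [
        {"doi": "10.1000/a1", "title": "First study"},
        {"doi": "10.1000/a2", "title": "Second study"},
    ],
    "search-02 database B": [
        {"doi": "10.1000/A2", "title": "Second Study"},
        {"doi": "", "title": "Third study"},
    ],
}

def record_key(record):
    """Simplified matching rule: lowercased DOI, else normalized title."""
    doi = record.get("doi", "").strip().lower()
    if doi:
        return ("doi", doi)
    return ("title", " ".join(record["title"].lower().split()))

def summarize(searches):
    """Count results per search and build a de-duplicated review set."""
    per_search = {name: len(records) for name, records in searches.items()}
    review = {}
    for records in searches.values():
        for record in records:
            # Keep the first copy seen; the search folders stay untouched.
            review.setdefault(record_key(record), record)
    return per_search, review

per_search, review = summarize(searches)
total = sum(per_search.values())
print(per_search)
print("total:", total, "after de-dup:", len(review))
```

The search dictionaries are never modified, which mirrors the rule of leaving the search folders alone as an archive.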

using the results list importer and the multi-format exporter for citation analysis projects – Citation analysis projects are looking for trends within a collection of citations. Those might be the results of searches, or the references used in a particular body of literature, or... One of the early projects I helped with of this sort was an investigation into the use of certain words within a field of study. Once the search was finalized and tested (did we want titles, abstracts, anything else? etc.) all those results were imported into Zotero.

Zotero is useful for this because not only is it relatively easy to import large numbers of citations, but it's also easy to export them into analyzable formats like .csv files. Obviously, if you don't need to clean up your citations, and your chosen analysis program can read BibTeX or whatever format you've already got, you don't need Zotero here.

First – getting things in. As mentioned above, results pages from searches can be imported relatively easily by setting the number of results per page to the highest available and using the web importer ('Select All'). (* again) You can also be more selective and collect citations within the database (assuming that you can mark

Sometimes, it's not a search, but an existing collection of citations, like reference lists. If those are available in a machine-readable format, like RIS or BibTeX, everything is easy. But often it's text. I just started using AnyStyle.io and it works quite well. Export the results in BibTeX and Zotero adds them with no problem. If needed, some cleanup can be done in Zotero, or citations can be filled in by searching in standard databases, importing the results, and de-duping. (I was using the review feature in Mendeley to do this, but I found that because Mendeley is checking against its own database of user citations, there is no guarantee that the citation is going to be any better than the one I started with and sometimes I got the wrong citation entirely.)

Second – getting things out. This is really easy with Zotero. If your library is entirely one project, just Export the Library in whatever format is most useful for your analysis. Probably that's CSV, but there are many options. If you are only exporting a portion of your library, it's almost as easy. Select the citations for export, right click (CTRL-click for Mac) and choose Export. Once you have your CSV or whatever file, you can import it into your chosen analysis program, whether that's Excel, R, Python, or whatever, and see what trends your citations reveal.
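As one example of what that analysis might look like, the Python sketch below counts how many citations per year mention a given word in the title or abstract, in the spirit of the word-use project described earlier. The sample rows are invented, and the column names "Publication Year", "Title" and "Abstract Note" are assumptions about the exported CSV, so adjust them to match your file.

```python
import csv
import io
from collections import Counter

# Invented sample standing in for an exported CSV; column names are assumed.
SAMPLE = """Publication Year,Title,Abstract Note
2018,Resilience in nursing,A study of resilience.
2019,Burnout among staff,No mention here.
2019,Building resilience at work,Resilience training outcomes.
"""

def term_counts_by_year(csv_text, term, fields=("Title", "Abstract Note")):
    """Count records per year whose chosen fields contain the term."""
    counts = Counter()
    term = term.lower()
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Case-insensitive substring match across the selected fields.
        text = " ".join((row.get(f) or "") for f in fields).lower()
        if term in text:
            counts[row["Publication Year"]] += 1
    return dict(counts)

print(term_counts_by_year(SAMPLE, "resilience"))
```

For a real export you would read the file instead of the sample string, for instance with open(path, newline="", encoding="utf-8-sig"), which tolerates a byte-order mark if the file has one. Note that this is a plain substring match, so "resilience" would also match "resiliences"; whether that is what you want is part of finalizing the search.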

*This generally works better with a good connection and NOT including PDF files. It also encourages good search practice because, yes, we're talking ALL the results. It also may trigger downloading limits in some databases, so you may want to check with the vendor before doing a really big project, especially if you are pulling full text and not just citations.