rhedreen

Random writings

At my university, faculty are evaluated every 6 years after we’ve been tenured (or after our last promotion). This is my year. It’s a simple thing, just lists of accomplishments and an updated CV, but it also includes a narrative. So here it is.

I started working at Southern in 2004, just about 5 years after I’d gotten my MLS. I’ve been at Southern for 21 years, which is the longest I have ever been anywhere in my life. Those 2 decades have been a roller coaster, but Southern is in a better position than it was when I started, and I like to think that I’ve had some influence in that. Six years ago was the fall of 2019. It seems like a different world in many ways. I’d just finished my previous assessment file as September started to wind down, we’d just hired Diana Hellyar as STEM Librarian and I was looking forward to helping her take on most of the science departments I had been liaising with for years. This would leave me with more time to concentrate on Biology, Nursing, and Psychology (that last was new to me.) I was taking the last class for my MSBio degree and hoping to finish my thesis in time for spring graduation, and we had no idea what was looming on the horizon.

However, I am ridiculously proud of our efforts in Spring and Fall 2020. I’d been beating the drum of “we all have to work with online students and faculty” for years, and I guess it paid off. The Library pivoted to online quickly, and, if not seamlessly, at least with comparatively few internal hiccups. I was already doing online instruction for classes and individuals, and had worked with many of my colleagues to get them doing online work as well. We had many of the tools we needed, like web chat and patron pick-up scheduling, we just hadn’t deployed them robustly before.

Our stats plummeted for Spring and Fall 2020, of course. Everyone’s did. There were just fewer people doing anything besides trying to get through the days. Professors re-wrote assignments to use fewer resources, and students got by with the minimum. I remember working with a student who was having trouble accessing library and class resources. She came in and everything worked fine on campus. It quickly became obvious that her problem was 6 people on a single wifi connection at home (I think it was 4 adults working from home and 2 kids in school online.) Which, honestly, was a problem we’d run into before, just not to that extent. The Library has always been a haven for those whose homes were crowded or noisy.

Besides the internal Library events and projects, I was also co-chair of the NECHE accreditation subcommittee for Standard 7 (libraries and facilities) for the midterm self study. This is definitely one of those “invisible labor” projects – tons of work, lots of nagging, emails galore, and, in the end, a single report that gets a “thank you” memo in your inbox at some point. Highly necessary, but exhausting and seemingly endless.

2020 was the year we all tried to figure out how to do conferences strictly online. One gem was the Force 11 Conference, on scholarly communication. I’d been wanting to go for years, but heading to LA in August was a hurdle, as well as an expensive trip. In 2020, they put their whole program online in December and had the best attendance ever, with attendees from the entire globe. While I appreciate the opportunity to chat with colleagues in the hallways and meet new people over lunches, I also really appreciate the ability to sleep in my own bed.

Another fantastic opportunity that I was able to do online was the Evidence Synthesis Institute for Librarians, offered online in August 2020 by librarians at Cornell University. The institute was designed to provide the needed expertise to work on evidence synthesis and systematic review projects as a librarian and search expert. Among other things, this clarified my view of librarians as experts in our own right. The term being used now is methodologists – experts in a particular methodology. I set up a guide based on the one at Cornell Library and started to explore how I could do a soft release of a consultation service and possibly take on full co-authorship for a few projects.

2021 felt like a slow recovery, the type that stalls and has little reversals, and in which it becomes hard to see where you’ve been and how you got where you are. During this year, I continued to work with the Office of Online Learning to do classes and videos about teaching and learning online, continued to work on the SCSU Authors project, and was finally able to get back to my thesis. I had switched the Sit Down & Write writing groups to online and it became the tool I needed to finish writing my thesis. Since I was facilitating, I had to show up every week! I also discovered things about my writing process that I hadn’t realized before. (Which just goes to show, it’s never too late to learn.)

In 2021 my colleague and friend Winnie Shyam retired. She had been Head of Reference/Research & Instruction for the entire time I’d been at Southern, so it was a wrench and a thrill to say farewell and to be asked to take on the position of division head. One of the first things that I implemented was Springshare’s LibStaffer for our desk schedule. Such an improvement over cumbersome Excel spreadsheets and awkward Outlook calendars! It’s definitely one of those things that once you have it, it’s hard to understand how you coped without it.

I finished, presented, and defended my thesis in fall 2021. As I’d planned, I released the thesis and the dataset openly, via the Zenodo repository platform. My last two presentations on this were definitely the most fun – speaking to the Biology Department about FAIR data and how my project met those goals (Findable, Accessible, Interoperable, and Reusable) and speaking to a high school ecology class (taught by one of my biology classmates) about the field of historical ecology.

In the fall of 2019 I’d roped Diana Hellyar, as a librarian new to SCSU, into our SCSU Authors project and she became a crucial collaborator for the database and the publications that came from it, including our 2021 ACRL Science & Technology Section poster with Scott Jackson from Institutional Research.

Fall 2022 was the first year I added Health & Movement Sciences to my list of subjects. That department had originally been Physical Education in the College of Education, and over the years had morphed into a more health science specific program and moved to HHS. However, our Education Librarian, June Cheng, kept her role as liaison librarian even after the department switched colleges. When she retired it seemed like the perfect time to switch librarians, and I took it on. I also released my Evidence Synthesis guide to a select group of faculty and have worked on some fascinating projects, none of which has (yet) seen publication, however.

In 2023, the article for a project that a group of nursing librarians, including myself, had been working on since 2019 finally came out. We had been trying to investigate the impact that difficult search assignments had on nursing students and new nurses, most particularly the “5-year rule” (only use studies published 5 years ago or less.) Do frustrating assignments lead to a disinclination to do literature searches? While we were not able to interview active nurses during that time (trying to get a nurse to do anything other than work and survive during COVID was a pretty big ask) we were able to interview nursing librarians, students, and faculty, so our project turned into more of an exploration of why the common search restrictions were used and what the reaction to them was. Librarians and students expressed frustration, and when we asked faculty why they used what they did, the answer was essentially, “That’s what I had to do.” We hope that it gives nursing librarians and faculty some things to think about when giving those common evidence-based practice assignments. It’s certainly informed my own nursing library instruction. I also presented our results at our SCSU Faculty Tapas event in Fall 2022, and some of my co-authors presented at the Medical Library Association conference in May 2022.

2023 was also the year I had 3 interns, 2 in the spring and 1 in the summer. One of the spring ones was supposed to join us the previous fall (2022) but had to postpone, so I ended up with 2 that spring. Jesse, Jenn, and Ariss were all a joy to work with and they tested out a new internship tool I created, the Intern Professional Development reading/viewing list. Each intern was to read or watch something (or find something themselves) and we’d discuss it. Reading or viewing time counted as internship hours. I’d been alarmed at the number of librarian applicants who didn’t seem to have a professional development practice (at least not a planned practice.) ILS students were telling me that while professional development was mentioned in classes, no one ever really specified what it was or how you did it. I was determined that my interns would gain a practice of weekly professional development.

Those three 2023 interns gave me great feedback on the initial list, and I expanded the list and ran Alexis, my Spring 2024 intern, through it as well. That went so well that I worked with all 4 of them to write an article for the new Journal of Graduate Librarianship. That was published as an editorially reviewed practitioner piece in the fall of 2024 (and it’s already been cited!)

2024 was the year that AI became a general concern. OpenAI released ChatGPT in the fall of 2022, and by the beginning of the spring semester 2024, it seemed like everyone was panicking. While I’m annoyed by and wary of all the “AI slop” that’s produced (have you tried to look for product reviews recently? Ugh!) I’m also happy that it’s produced some of the best information literacy conversations I’ve had outside the library field in decades. From a library perspective, I’m not sure that AI has really produced anything that didn’t exist before, but it’s accelerated the production of problematic stuff immensely. More people are seeing and being affected by bad info than previously (and maybe it’s also easier to blame “AI” than to admit that our previous information sources were flawed.) However, the answer isn’t to reject AI or to accept defeat. It’s to work on classic information literacy skills and strategies: where does this come from, who was involved in creating it, what was the intent and intended audience, is this useful to me for my own information needs? With this philosophy, I’ve given talks and demos, joined groups on campus, in the system, and within the profession at large, and collaborated on the AI guide that my colleague Amy Jansen, Business Librarian, put together. If librarians can use AI to finally get information literacy concepts into curricula, then it may all be worth it.

While all the AI craziness has developed in the online world, R&I worked to make our physical Research & Information desk more visible and accessible. We moved from the side space near the IT helpdesk to the back of the 1st floor, further away from the front doors but visible from them. Overall, we are liking the new space and are moderately confident that we are more visible. It’s unclear if that is reflected in the statistics so far (we’ve been in the space for less than a year) but anecdotally we are talking to more people and have had at least 2 students who had never found us previously exclaim how much they liked having the librarians so available.

During 2024 I did a full overhaul of the Health & Movement Sciences Guide as part of a project to work on curriculum mapping for library instruction in the department and revamp the guide to better work with that curriculum. From a format-based guide (books, journals, databases, websites) that I inherited from the previous librarian, I moved to a topic-based guide, with the topics based on the curriculum. Statistics are low, but I haven’t done much marketing to the department yet. I worked on a similar update in summer 2025 for Psychology.

2025 has seen retirements and other leave-takings, leaving us more short-handed than we’ve ever been in my years at Southern. That’s put a great deal of strain on the R&I division, and I see my current role as head as being to ease the impact on my colleagues as much as possible. We need to take some very hard looks at what we do and what we can do, especially with the designation of SCSU as an R2 university and the anticipation of more complex demands on our librarian skills.

This past summer I started a wonderful collaboration with the School of Graduate & Professional Studies. My librarian colleague Amy Jansen, Tess Shapiro-Marchant from Political Science, and Cheryl Durwin from Psychology worked with Jessica Jenson from GPS to do a series of workshops on writing and research over the summer. It’s continued into the fall and looks poised to become a regular series.

From COVID to AI; from working exclusively online to emphasizing our physical presence on campus; an additional graduate degree; exploring information usage by nurses and instilling professional development habits with ILS interns…it’s been a busy 6 years. It’s been a hard time personally as well, losing my mother and stepfather (and 2 cats), and having several health crises. I continue to try and be hopeful for the future, even when optimism is challenging, because I believe in our University mission and that our students, faculty, and staff make a positive difference in this world.

What is “AI”?

  • Semantic Analysis vs. LLMs
  • Image analysis vs. Art/graphics generators
  • self-contained vs. open to internet

What can they do?

  • LLMs generate “plausible” text: likely vs accurate. This means that results will usually read well and not have many obvious flaws – but the flaws that are there may be *much* less obvious!
  • LLMs have a “temperature” setting that controls how predictable or varied the output is – lower values stick closer to the most likely wording, higher values wander more. (See the sketch after this list.)
  • Prompt wording is important – use the prompt to set the circumstances, the type of results, the audience, and any details. (Note: some models will ignore things that they can’t handle; try rewording, or a different model/platform.)
  • All generator systems are dependent on the training data; bias in/bias out. Common errors are repeated; common stereotypes are emphasized.
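
To make the temperature and prompt points concrete, here’s a minimal sketch using OpenAI’s Python client. The model name, temperature values, and prompt are my own illustrative assumptions, not recommendations.

```python
# Minimal sketch: the same prompt run at two temperatures.
# Assumes the openai Python package and an OPENAI_API_KEY in the environment;
# the model name and settings here are placeholders, not recommendations.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

prompt = (
    "You are writing for first-year undergraduates. "
    "In 3-4 sentences, explain what peer review is and why it matters "
    "when evaluating sources."
)

for temperature in (0.2, 1.0):  # lower = more predictable, higher = more varied
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{"role": "user", "content": prompt}],
        temperature=temperature,
    )
    print(f"--- temperature {temperature} ---")
    print(response.choices[0].message.content)
```

Running the low-temperature version twice will usually give nearly identical answers; the high-temperature version will vary more from run to run.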

These are **Tools**

Ethics

  • training data/copyright
  • training processes
  • privacy
  • “replacing” humans

Learning more

  • plenty of free webinars and short courses/workshops available
  • beware both hype and doom
  • try things out!

Tools to try

Went for a New Years walk in the local park to look for New Years inspiration (stuff to think about; my preference rather than resolutions.) The things that caught my eye were a very large tree that came down due to wet soil where the local river has been overfull, and this shimmery reflection of trees in the river. A reminder that change is ever-present, but it can be catastrophic or beautiful, or even both at the same time.

This summer (2023) I tried out some generative large language models for a couple of text manipulation tasks that have been plaguing me for some time: converting text citations into machine readable bibtex for importing into citation managers like Zotero, and extracting bits of text data from semi-structured text (my thesis project.)

tl;dr: It can work, sort of. But I don't think I'll be using any of the models regularly for this sort of task.

For my experiments, I used ChatGPT 3.5 (mostly the May 2023 version), the one freely available on the web, and I tried out several downloadable models in GPT4All.io (available for Windows, Mac, & Linux.) Part of the experiment was using easily available tools, so, yes, I know that I can access ChatGPT 4 using a paid account and/or by paying to use the API with something like Python. I would probably get better results with those, but they aren't something that I can simply recommend to the average student or faculty member without more background or training. Another caveat: the local models require a LOT of processing power to work well, so that probably affected my results, as I was using a several-years-old MacBook Pro with only average processing power.

The citation task is something that comes up a lot in my librarian work, especially when introducing someone to a citation management program, like Zotero. Everyone has text-based citations lying around, in papers written or read, but you can't just copy and paste them into the programs. If you have IDs, like ISBNs or DOIs, you can paste those in, but for citations that don't have those handy shortcuts, you are generally limited to searching for an importable record in a library database or search engine, or manually entering the info. I wanted to see if generative AI could assist with this. Besides formatted text citations (mostly APA, but some other formats) I also tried pasting in info from a Scopus CSV download – because that was a specific question that I'd gotten earlier.

ChatGPT did pretty well. It recognized the bibtex format and could reproduce it. It often needed me to identify (or correct) the format (book, conference paper, article, etc.), but it takes suggestions and spits out a corrected version. The difficulty came with the “generative” part of the model – it makes things up. It “knows” that article citations are supposed to have DOIs, so it added them. They look fine, but aren't always real. It also made up first names from authors' first initials. It mostly obeyed when I instructed it to NOT add any additional info beyond what was in the citations, but that would have to be double-checked for every citation. Which is fine if you are doing a handful, but a tedious task if you are doing hundreds. It did work well enough to add to my Citations from Text Protocol document (https://dx.doi.org/10.17504/protocols.io.bp2l6bkq5gqe/v2) with a comment about the incorrect additions.
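
For anyone who wants to script this instead of pasting citations into the chat box, here's a rough sketch of what an API version might look like. The model name and prompt wording are my own assumptions, and every entry still needs the same checking for invented DOIs and expanded author names described above.

```python
# Rough sketch: convert plain-text citations to BibTeX via the OpenAI API.
# Assumes the openai Python package and an OPENAI_API_KEY; the model name
# and prompt are illustrative only, and the output still needs human
# checking for made-up DOIs or author first names.
from openai import OpenAI

client = OpenAI()

INSTRUCTIONS = (
    "Convert each citation you are given into a BibTeX entry. "
    "Use only the information present in the citation; do NOT add DOIs, "
    "author first names, or any other details that are not given."
)

citations = [
    "Quinlan, N. J. (2007). In flew Enza. American Libraries, 38(11), 50-53.",
    # ...more pasted-in citations...
]

entries = []
for cite in citations:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[
            {"role": "system", "content": INSTRUCTIONS},
            {"role": "user", "content": cite},
        ],
        temperature=0,  # keep the output as predictable as possible
    )
    entries.append(response.choices[0].message.content)

# Write a .bib file that Zotero (or any other citation manager) can import.
with open("converted.bib", "w") as f:
    f.write("\n\n".join(entries))
```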

The various models in GPT4All didn't do so well. Most of the available models couldn't recognize or produce bibtex format. Several only seemed to be able to reproduce something similar to APA style, while others just repeated back whatever I had put in the prompt. This is most likely a factor of the training database – I'm quite sure that if the training data included more citation formats, at least one of these models would be capable of doing this task.

The 2nd task is one left over from my thesis work (https://doi.org/10.5281/zenodo.5807765): compiling ecological data from a series of DEEP newsletters. Within the paragraphs of the newsletters are sightings data for gamefish in Long Island Sound, often with locations and sometimes measurements. At the time, I queried a number of people doing similar work with text-mining programs but couldn't find anything that could do this sort of extraction. This seemed like something an LLM should be able to do.

Again, ChatGPT did better than the locally run models, probably (again) due to the processing power available. The GPT4All models mostly extracted the info correctly, but they couldn't format it the way I needed. ChatGPT was able to do that, and was even able to handle text from older documents with faulty OCR. But it was inconsistent: I sometimes got different results from the same text, and it never found everything that I pulled out manually. ChatGPT did not insert additional info in any of my tests. This is a task that I would be curious to try with a more advanced model.
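
For the record, this is roughly how I'd picture scripting the extraction step. The prompt, the column names, the sample paragraph, and the model are all my own assumptions, and the output would still need to be checked against a manual read.

```python
# Sketch of the extraction task: pull gamefish sightings out of newsletter
# text and parse them as CSV rows. The prompt, columns, model name, and the
# made-up sample paragraph are illustrative only; results should always be
# checked against a manual pass.
import csv
import io
from openai import OpenAI

client = OpenAI()

PROMPT = (
    "Extract every gamefish sighting from the text you are given. "
    "Return CSV with the columns: species, location, date, measurement. "
    "Leave a field empty when the value is not stated. "
    "Do not add any rows that are not supported by the text."
)

# Made-up example paragraph, in the general style of the newsletters.
newsletter_text = (
    "Anglers reported good numbers of striped bass off Niantic Bay this "
    "week, including one 42-inch fish, and bluefish continue to show up "
    "near the mouth of the Connecticut River."
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[
        {"role": "system", "content": PROMPT},
        {"role": "user", "content": newsletter_text},
    ],
    temperature=0,
)

for row in csv.reader(io.StringIO(response.choices[0].message.content)):
    print(row)
```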

While my test results were lackluster, I see promise in this work. But not enough promise, right now, to be confident using the programs or to counter the ethical dilemmas inherent in the current models. OpenAI (the producer of ChatGPT) employs thousands of low wage workers, mostly in the Global South, to screen inputs for its model, and uses a huge amount of energy and water to run and cool its servers. This is something that anyone seriously contemplating using ChatGPT, or other LLMs and generative AI models, for research tasks should consider.

Possible title/topic: Advanced note-taking for writing: personal knowledge management – for (advanced undergrads, grads, faculty); writing for theses, dissertations, publication

Note-taking serves 2 purposes for writing: 1. Writing for retrieval – writing down things you want to be able to find again. 2. Writing for thinking – writing as a way of digesting, conceptualizing and putting info in context (especially your own context.)

Writing for retrieval

  • Proper identification of sources – citation info (formatted or just complete); source (links, files, etc.); include info on finding (search engines/databases, search terms, search or access problems, etc.)
  • Description – tagging or other searchable text that includes context; why is this important to you; quotations (with complete citation and page numbers!) or annotations; think how you might be looking for this later on.
  • Storage – digital storage means that you can make these notes searchable; there are many tools specifically designed for note-taking (OneNote, Evernote, Joplin, etc.) and ones designed for knowledge management (Obsidian, Logseq, Roam, Tana, etc.), but you can use any searchable file – including Word or Google docs. However, I recently had a conversation with someone who wanted hardcopy only, and we thought about writing notes (with source info) on post-its, kept originally on the printed docs, then moving the post-its as needed to a notebook or whiteboard for organizing and as writing prompts. Look up “Zettelkasten” for high-end hardcopy note-taking!

Writing for thinking

  • Context & Connection – how would you use this; what does it remind you of; what other things do you have notes on that might be related (search and link!)
  • Summarization – rephrase and summarize (you can't do this if you don't understand it.) Combine with other info; be explicit about connections. Don't just paraphrase by swapping out words – write your own understanding.
  • Many people recommend individual notes for specific references (or specific ideas from specific references) and then separate notes for concepts that link/reference to the individual source notes. Update the concept notes as you collect new sources, and split off new concept notes when you have a new concept or aspect to explore. But link back to the old concept notes and to any relevant source notes.

Using notes in writing

  • Avoid the blank page – copy useful notes (usually source notes for lit review or analysis; concept notes for more topical paragraphs) into a new document and/or into an outline to start.
  • Save extraneous ideas for later – if something needs to be cut, you don't have to throw it away (aka “Killing your darlings”), just make a new note.
  • This sort of note-taking should be iterative. Write some notes; this gets the ideas into your brain. Your brain can process and digest the ideas and produce new insights. Write down those insights, connecting to new and old info. Repeat.

Someone just described ChatGPT as a tool that shows you what an answer looks like (rather than giving you an accurate answer.) Which sounds useless.

But this is something that as a librarian I have a problem describing to students. In traditional (non-LLM) searching, you need to search using the language of the answer. So, for example, I tell my biology students to search using the Latin species name of an organism, because that is more likely to result in scientific articles.

So a possible use would be to show you what language the answer is likely to use, how the language is used, and what related concepts you should think about. The trick is how that could be presented in a way that doesn't lead to the shortcut of “That sounds like a reasonable answer. I'll stop here.”

One way is Elicit's “Suggest Search Terms” task – put in your question and Elicit pulls out common keywords and phrases. (Elicit requires an account, and it's a little hard to imagine that it's going to remain free forever.)

I heartily dislike both the AI-hype and the AI-doom-and-gloom that makes up most of the popular reading on ChatGPT and other large language model tools. These are neither the best thing since the personal computer, nor the thing that will bring higher ed crashing down around our ears. They are tools, and like all tools have appropriate and productive uses, inappropriate and concerning uses, and things for which they are not suited (but that people are trying to use them for anyway.) You can use a chisel to do wood carving, or you can use a chisel to break open a locked door. The latter might be a good thing or a bad thing depending on circumstances. You could also try to use a chisel as a screwdriver, and it might work for some screws, but it's not the best tool and you are likely to hurt yourself if you aren't very careful. (And it's not that great for the chisel, either.)

In my own experimentation, I've come up with some use cases that I think fit into the 'appropriate and productive' category. The one thing that I've found so far is that these tools are most useful for manipulating text. The real 'generative' uses seem to me to be very superficial. 'Produce a [blank] in the style of [blank]' is fun the first few times, but not very interesting overall. And mostly kind of bland (which, as someone pointed out, should only be expected: GPT produces essentially the most likely or “average” text for a given prompt (1).) More like 'produce a [blank] in the style of [blank] on a bad day.'

Here are what I've found useful. I'll add to this as I find new uses.

  1. Translation. ChatGPT levels up translations from the standard translation engines available on the web. Like all machine translation, results are a bit stilted, colloquialisms can be confusing, and less common languages give worse results, but overall, I'm pleased. I suspect that all the translation engines will be incorporating LLMs (if they haven't already) and we should see improvements in the applications soon.

  2. Text mining. I was very excited to find that ChatGPT could extract semantically useful info from narrative text. This is what I did my thesis on, and a very tedious 20K entry dataset it ended up being. I'm eager to start comparing the GPT-generated results to my previous work and to add to my dataset with new entries as soon as I'm satisfied with the quality.

  3. Search assistance. I probably shouldn't have been surprised that ChatGPT could extract semantically useful info from text, since that's exactly what the 'research assistance' apps like Elicit and Consensus do. Those specialize in analyzing research papers and pulling enough info out for you to figure out if the paper might be useful for you. Both are in heavy development right now, but can do things like extract text that is related to an actual question or pull methodological info out of a selection of studies (population size, analysis techniques, etc.) (2)

  4. Transforming text into machine readable formats. Since ChatGPT can do translation and can also “produce in the style of”, it stands to reason that it should be able to manipulate text into a particular format. And it can, at least with formatted citations into bibtex, one of the importable file formats that citation managers like Zotero use. It would be tedious to do a lot of them at once because of the character limits, but I'm hoping someone will write a script using the API. I had hopes that it might be able to do the same with citations in a spreadsheet (CSV) but since the prompt box is unformatted text and limited characters, I couldn't get it to recognize more than a line or two at a time. It did a reasonable job on those few lines, however. Again, very tedious to do a lot, but it would work and might be suitable for some API scripting.

  5. Audio transcription. I've actually paid for a transcription program called MacWhisper that uses a variety of Whisper models to transcribe audio. It's a cut above the machine transcription available in most of the presentation tools (Zoom, PowerPoint, Google Slides) and it works locally, so it's a bit more private than the better services like Otter. It still has trouble with names, but it has a new feature that lets you put in commonly mis-transcribed words, so I can probably get it to stop misspelling my name and the CINAHL database (sin all, cinder, etc.) MacWhisper is Mac only, but I just saw a project called Audapolis that's multi-platform.

  6. Suggesting new wording. If you've got an awkward paragraph, or want to improve the flow and readability of something, ChatGPT does a decent job. One of the first things I tried with it was inputting a couple of paragraphs from something written by someone for whom English was definitely not their native language. The original was comprehensible, but stilted and had some weird word choices. I'd assume, given the translation ability, that it can do that for other languages as well. The results weren't spectacular, basically just bland American English academic prose, but definitely more readable. If I was doing this for anything real, I'd want to very carefully review the resultant text to be sure nothing had been factually changed.

  7. Avoiding 'blank page syndrome.' Since ChatGPT doesn't do well at producing accurate facts and references that actually exist, this one is better done with another tool. I found that Perplexity gave a decent summary of what might be called 'accepted internet knowledge' on a topic, and gives citations – and you can specify what sorts of things you want as references: scholarly journals, newspapers, etc. As I mentioned previously, Elicit and Consensus will both give text extractions and summaries direct from research papers. Any of this could be used to construct an outline or scaffolding to get rid of that blank page. ChatGPT can produce an outline, too, just be sure to thoroughly check any factual claims. Really, do this anyway – just because something is common on the internet doesn't mean it's right (Perplexity) and extracted sentences taken out of context may be deceiving (Elicit and Consensus.) In a way this is the reverse of #6: start with the LLM and have the human produce the final text.

All of these but the last are on the order of “take this text that I give you and do something with it” rather than “produce new text from scratch.” Only the last 2 stray into what could be considered an ethical grey area – how much of the writing is LLM-produced and how much person-produced, no matter which comes first?

Anyway, these are what I've found actually useful so far.

(1) https://www.washingtonpost.com/technology/2023/04/01/chatgpt-cheating-detection-turnitin/ (2) Update (2023-04-23): Aaron Tay has also been experimenting with using LLM's for search, specifically Perplexity, and he keyed in on the ability to restrict the sources used as well. https://medium.com/a-academic-librarians-thoughts-on-open-access/using-large-language-models-like-gpt-to-do-q-a-over-papers-ii-using-perplexity-ai-15684629f02b

While ChatGPT has gotten all the press, there are some “AI” (I don't actually like this term [1]) Large Language Model and Semantic Analysis tools out there that I think can help with doing searches and finding literature.

In a theoretical search scenario, I think I'd start with Perplexity.ai (https://perplexity.ai; no registration required), an “answer engine.” It also gives you a short answer to questions, but, unlike ChatGPT, it's doing actual internet searches (or at least searching a reasonably updated internet index) and cites its sources. You can even ask it for peer-reviewed sources. This is a lot like using a good Wikipedia entry – get an overview, some interesting details, and some references to follow up on. It is, like most internet-based things, going to be biased towards whatever the majority of the sources say, so I could see it spewing some pseudoscience or conspiracy stuff out, but it does look like the programmers gave it some filters on what it uses for sources. As they say, “Perplexity AI is an answer engine that delivers accurate answers to complex questions using large language models. Ask is powered by large language models and search engines. Accuracy is limited by search results and AI capabilities. May generate offensive or dangerous content. Perplexity is not liable for content generated. Do not enter personal information.”

Then, I'd take those sources and plug them into a semantic/citation network search like SemanticScholar.org (https://semanticscholar.org; no registration required), ConnectedPapers.com (https://connectedpapers.com; no registration required), and/or ResearchRabbit.ai (https://researchrabbit.ai; registration required.) These look at the citation networks, author networks, and/or semantic relationship networks of scholarly works and display them in different ways to show you (what might be) related works. Most of these are based on SemanticScholar's database (as the most open and freely available scholarly source out there) so they mostly come up with similar results, but each has additional features that expand on the base. SemanticScholar's “Highly Influential” citations attempt to determine the works most closely based on the original work. ConnectedPapers looks at 2nd order citations (cited by citing articles or references, etc.) to identify what might be foundational works or review articles, and has nice network maps to explore. ResearchRabbit can look at groups of papers to find common citations and authors, and you can view results in network maps and timelines. If you register, all of these offer alerts, too, based on your searches.
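
If you'd rather script this step, Semantic Scholar also exposes its data through a public Graph API. Here's a rough sketch of looking up a paper and pulling the papers that cite it; the endpoints, parameters, and response fields are as I understand the current API, so check the documentation before relying on it.

```python
# Rough sketch: find a paper in Semantic Scholar and list papers that cite
# it, using the public Graph API. Endpoints and response fields reflect my
# reading of the API docs at the time of writing; the query is just an
# example, and heavy use needs an API key.
import requests

BASE = "https://api.semanticscholar.org/graph/v1"

# 1. Find a seed paper by title keywords.
search = requests.get(
    f"{BASE}/paper/search",
    params={"query": "historical ecology Long Island Sound", "fields": "title,year"},
).json()
paper = search["data"][0]
print("Seed paper:", paper["title"], paper.get("year"))

# 2. Pull the papers that cite it (first page of results).
citations = requests.get(
    f"{BASE}/paper/{paper['paperId']}/citations",
    params={"fields": "title,year", "limit": 20},
).json()
for item in citations["data"]:
    citing = item["citingPaper"]
    print("-", citing.get("year"), citing.get("title"))
```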

Once I had a core set of works, I'd go back to the tried and true library databases, especially ones with subject headings/controlled vocabulary. A controlled vocabulary establishes a single word or phrase for concepts within that particular discipline (MeSH for medicine, ERIC Thesaurus for education, etc.) Every work entered into the database is tagged with these “controlled” terms so that you can be confident that all the articles in Medline/PubMed about heart attacks are tagged with “myocardial infarction.” (There are some experiments with using semantic algorithms to tag database entries, but to the best of my knowledge all or most of the traditional sources still use humans as quality control.) By looking up some of the articles I found through the other sources, I could find the relevant subject headings and use those to search out more results.
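
As a small, concrete example of what the controlled vocabulary buys you, here's a sketch of a MeSH-based PubMed search through the NCBI E-utilities. The parameters follow my reading of the E-utilities documentation, and the heading is the one mentioned above.

```python
# Small sketch: search PubMed by MeSH heading through the NCBI E-utilities.
# Records indexed with the "Myocardial Infarction" heading are retrieved
# whether the article text says "heart attack", "MI", or anything else.
# (Parameters follow my reading of the E-utilities docs; add an API key and
# contact email for anything beyond light use.)
import requests

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

resp = requests.get(
    ESEARCH,
    params={
        "db": "pubmed",
        "term": '"myocardial infarction"[MeSH Terms]',
        "retmode": "json",
        "retmax": 10,
    },
).json()

result = resp["esearchresult"]
print("Total records tagged with the heading:", result["count"])
print("First few PubMed IDs:", result["idlist"])
```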

Elicit.org (https://elicit.org; registration required) is another GPT-based tool that bills itself as a “research assistant.” It's a little more complicated, but has some very interesting features. It pulls out quotes from the results that it determines are relevant to your question or topic. You can ask it for particular types of research (qualitative, etc.) or have it highlight aspects of the research, like study size. There are additional “tasks” besides the general search feature – one of which is to find search terms related to a topic! It's still very much in the experimental stage, but also very intriguing.

So...with all of these new tools, am I worried about being replaced by a librarianbot? No, I'm not.

  • Using these tools requires skill, which means either training or time and willingness to experiment. In my experience, most people want training and are happy to outsource the experimentation to people like me.
  • The scholarly publishing world is complicated and while open access has made a lot of stuff more easily available, it's also made things even more confusing. I get a LOT of questions about accuracy, currency, and other quality issues and I do not see those going away anytime soon. (And in the short term, I think those will be even more of an issue as tools like ChatGPT generate plausible but inaccurate text that gets put out there unidentified.)
  • Access is still a big issue, and librarians are the people that most institutional scholars turn to for access issues.
  • A lot of people like working with people or at least having a person available to them. It's reassuring to know a real person has your back. “Hand holding” (you know what to do but you want me to reassure you that you are doing it right) is a big part of learning, especially in this age of anxiety.
  • Most of these tools remove tedium. Younger scholars have no idea what I mean when I say that the biggest benefit of online databases is being able to search more than one year at a time. Only people who remember printed indexes (or at least CD indexes) can appreciate NOT having to search each yearly (or semi-annual) volume one after another after another... I'm quite happy not doing that any more. It means I get to do more interesting things, like working on systematic reviews with other researchers, or investigating new search tools, or teaching citation and note-taking systems.

Which means I'm excited for these new tools, as long as they are producing useful results. I'm less excited about tools that produce mis-information, like ChatGPT's made-up citations [2] or the AI-powered voice synthesizer that everyone except the promoters predicted would be used for faking celebrity statements. [3]

So go out and enjoy the AI (again, see footnote 1). No one is stuffing this genie back into the bottle, so we need to learn how to live with it. And it can make some things better.

[1] I don't like the term AI/Artificial Intelligence because we all grew up with science fiction and AI that was actually AI – artificial beings with what we could easily recognize as human-like consciousness and intelligence. (Putting aside for the moment the problem that we often don't recognize or want to recognize the intelligence of other human beings – often the point of those science fiction stories.)

[2]

[3] https://gizmodo.com/ai-joe-rogan-4chan-deepfake-elevenlabs-1850050482

Update (2023-04-23): Aaron Tay has also been experimenting with using LLM's for search, specifically Perplexity, and he keyed in on the ability to restrict the sources used as well. https://medium.com/a-academic-librarians-thoughts-on-open-access/using-large-language-models-like-gpt-to-do-q-a-over-papers-ii-using-perplexity-ai-15684629f02b

If, as implied by our Provost's email, a possible response to COVID-19 could include moving courses online, some provision for library services to those courses will be required.

Much depends on the exact nature of the quarantine or other restrictions. In other similar situations I have heard of vulnerable employees being placed on a leave of absence, closing the library to the public but continuing services otherwise, allowing staff to telecommute, etc.

From conversations with librarians at other institutions who have dealt with emergencies that prevented library staff from coming in (mostly during natural disasters), I can summarize the strategies:

  • It should be determined if the institution can provide additional disinfection supplies (hand sanitizer, disinfecting wipes, etc.)
  • Supervisors should determine what work can be done remotely and which employees can work remotely (i.e. have the required equipment and/or internet services to do so).
  • If employees need additional equipment (laptops, barcode scanners, etc.) that equipment should be identified and prepared.
  • Provisions and practice may be needed for subject librarians to provide reference and instruction online. It should be determined who can provide video conferencing vs. web chat, for instance, based on equipment, connections, skills, and experience. Practice sessions can be arranged.
  • Assuming that some staff are allowed in with the physical collections, protocols for requesting scans of library materials need to be established.
  • If staff are not allowed into the building, then extra budgetary resources may be needed for ILL and document delivery.

A useful article summarizing the library response to the 1918 pandemic and with planning advice is “In Flew Enza” from American Libraries (Quinlan, 2007).

Advance planning provides the best defense against emergencies and reduces stress for both employees and patrons.

Quinlan, N. J. (2007). In flew Enza. American Libraries, 38(11), 50–53.

(Updated 2019-06-27; updated 2023-01-16 to note Microsoft Academic's demise; also to note that I am no longer using Kopernio, which is now part of Clarivate's Endnote. I need to make a new list!)

A list of online tools, services, and software I'm finding useful.

  • Kopernio (Chrome, Firefox) https://kopernio.com/ Now owned by Clarivate Analytics (owners of Web of Science and Journal Citation Reports), Kopernio is a browser plugin (Firefox & Chrome) that can find PDFs of articles via both open access sources and library subscriptions. Also has online storage “locker” for PDFs. Obviously, some data harvesting is going on, but it really simplifies finding full text articles. The Google Scholar plugin (from GS) has similar functions (data to Google, obviously) and Lazy Scholar is an independent plugin that also allows you to access your library's subscriptions. Unpaywall and the Open Access Button are other plugins that only find open access sources. All good, but right now I'm trading data (with a reasonable data privacy policy) for convenience with Kopernio.
  • Zotero (Mac, Windows, Linux) https://zotero.org/ My preferred citation manager, and the only major one currently NOT owned by a big publisher. (Elsevier owns Mendeley, Clarivate owns Endnote, and Proquest owns Refworks.) But besides that, I prefer it for the ease of import and export. Generally, I find the import options work better and it exports in more formats (RIS, bibtex, csv, etc.) than other software. It comes with over 9000 citation styles, and reads the open source format CSL so you can modify or create any style you need if those 9000 aren't doing it for you. If you just need a quick formatted citation, try their online service, ZoteroBib, https://zbib.org/
  • Anystyle.io (online) https://anystyle.io/ Convert text formatted citations (APA, MLA, etc.) into bibtex or other machine readable citation formats. It's essentially a proof of concept, machine learning GitHub project, but I was able to take around 100 references from a dissertation and get a file to import into Zotero in about half an hour. Not bad for something “intended for non-commercial, limited use.” Certainly a LOT faster than if I'd been typing (or copying/pasting) them in, or even doing a search-and-import. It's not instantaneous, but you have a lot of control over the process, so you can fix errors before importing.
  • Texts (Academic Markdown Editor) (Mac, Windows) http://www.texts.io/ I'm trialing this markdown editor right now, and I am liking it. A minimalist writing tool, it uses markdown language (think HTML for more general text) to structure documents (and structure leads to formatting.) You can do all the major document structures: lists, numbered lists, headings, footnotes, quotations, and – unusually for these markdown editors – citations. (Citations are a little tricky, but once you've got the system down, it's not too hard.) Once I have my basic text, I can export in a number of formats, including PDF, RTF, Word, HTML, ePub, LaTeX, and a very nice minimalist HTML presentation style that I would be very happy to present with at a professional conference. What's really neat is that the SAME FILE can be exported in multiple formats, so a properly structured text document can automatically make a presentation, etc. Those are the benefits of markdown; Texts just makes it all really easy. The only drawback I've seen so far is an inability to resize images. I think it's a 30 day trial, and I've seen a price of $19 in reviews (though I'm not seeing a cost on the website right now.)
  • Microsoft Academic Search (now defunct, 2022) Microsoft's answer to Google Scholar, with extras. Besides the general indexing, Microsoft has added a semantic analysis component, so that things like institution are parsed out of articles and become searchable. Each document entry includes the usual citation, abstract, and sources (OA direct downloads), but also how the document fits into the citation network (references and citing articles) and all the parsed topic, institution, journal, date, and author data. Plus, a “related” search that uses a semantic analysis to find similar documents. My librarian's heart is cheered by the fact that it also lists my research guides as scholarly documents (even if it did take a bit of work to “claim” all of them when I set up my profile.) Two drawbacks: 1) apparently you can't use your institutional Microsoft account (Office 365, etc.) to login – at least it's never accepted mine; and 2) there is no ability to link to an institution subscription login. (However, see Kopernio, above.) It's currently a smaller database than Google Scholar, but it's growing, and it has some very nice features of great use to researchers and students.
  • JSTOR's Text Analyzer (online) https://www.jstor.org/analyze Speaking of semantic search, JSTOR's Text Analyzer (beta) does that, too, and shows you what it's doing. You don't need a JSTOR subscription to use it, just go to the page and upload another article or paste in some text and see what comes up. Then you can play around with the search features to refine the search. (If you don't have an institutional subscription, btw, JSTOR has a number of options for independent scholars that are very affordable.) JSTOR has a nice write up on teaching with it, too, at https://daily.jstor.org/how-to-teach-with-jstor-text-analyzer/. Other suggestions I've run across include having students use the subject terms to help interpret and summarize an article. (JSTOR's own video implies that you can use it for that “I wrote the paper, now I need sources” style of academic writing. Librarians would prefer to discourage that, however.)
  • Publish or Perish (Mac, Windows, Linux) https://harzing.com/resources/publish-or-perish Anne-Wil Harzing wrote the Publish or Perish software to help academics document and discuss the impact of their research. You can get a citation report from all the major citation sources (some you need subscriptions for). She's got extensive documentation, including print books.

Good post with more tools: https://medium.com/@aarontay/top-new-tools-for-researchers-worth-looking-at-9d7d494761b0