Comparing Traditional Databases With an AI Academic Paper Search Engine

For as long as digital computing has existed, traditional databases have offered users the same basic service: you enter search terms and get back whatever matches them. But how many times have you waded through two or three pages of results that had nothing to do with what you were looking for? I have done it so often that I lost count, and that is why I began looking for a tool that understands what the person searching actually wants, not just what they typed. That is how I arrived at the idea of an academic paper search engine built on artificial intelligence (AI). With such a tool, querying shifts from strict syntactic patterns toward a more natural, conversational exploration of documents. To illustrate that shift, I will compare traditional databases (e.g. PubMed, JSTOR, Scopus) with WisPaper, an AI-based document search engine that is creating some buzz in the academic community.

The “Type and Pray” Method: How Traditional Databases Work

Traditional databases generally work on a basic principle: they index metadata (titles, authors, abstracts, keywords), and sometimes full text, and compare your search terms against that index. Think of it as a vastly expanded library card catalog. Unfortunately, traditional databases are very literal. If you search for the exact phrase “neural mechanisms of memory consolidation during sleep” and a relevant paper instead uses the term “synaptic homeostasis hypothesis,” you will never see it. To find what you need, you have to know the secret handshake: the right synonyms, the right Boolean operators, the right subject headings.
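
To make the literal matching concrete, here is a minimal sketch of a keyword index with a Boolean AND search. The two paper titles, the function names, and the whitespace tokenizer are all invented for illustration; no real database is this simple, but the failure mode is the same.

```python
from collections import defaultdict

# Hypothetical mini-corpus: two titles on closely related topics.
PAPERS = {
    1: "Neural mechanisms of memory consolidation during sleep",
    2: "The synaptic homeostasis hypothesis and overnight learning",
}

def tokenize(text):
    """Lowercase and split on whitespace (real systems do much more)."""
    return text.lower().split()

def build_index(papers):
    """Map each word to the set of paper IDs whose title contains it."""
    index = defaultdict(set)
    for paper_id, title in papers.items():
        for word in tokenize(title):
            index[word].add(paper_id)
    return index

def search_all_terms(index, query):
    """Boolean AND: return the IDs of papers containing every query word."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result

index = build_index(PAPERS)
print(search_all_terms(index, "memory consolidation sleep"))  # {1}
```

Paper 2 covers the same ground but shares no words with the query, so it never appears. Only someone who already knows to search for “synaptic homeostasis” will find it.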

When I was doing a literature review on urban heat islands, it took me ages to craft the perfect search strings: first “urban AND heat AND island AND mitigating”, then “cool pavements OR green roofs”. Neither produced useful results; both returned off-topic articles about agricultural soil moisture, planets, or climate modelling. A traditional database has no understanding of your intent (in my case, that I cared about people’s comfort); it just matches the letters you typed. That works well when you know exactly what you are looking for, but we usually go into research not yet knowing what we want to find. This is the first key distinction when switching to an AI-based academic paper search engine.

The AI Difference: Understanding, Not Just Matching

Let’s discuss WisPaper, the newest member of the search engine family. Unlike traditional academic paper search engines, which only index the words found in papers, WisPaper also attempts to understand the ideas behind those words. You can ask a question in everyday language: “What is the effect of sleep deprivation on long-term potentiation in the hippocampus?” WisPaper’s AI reads your question and identifies the entities you are asking about (sleep deprivation, long-term potentiation, and the hippocampus). It then interprets how those entities relate to one another (sleep deprivation affects long-term potentiation) and searches the papers in its database for work that addresses that relationship. So rather than looking only for the exact phrase “long-term potentiation,” WisPaper looks for anything that discusses changes in synaptic strength as a result of lost sleep.
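
The difference between matching words and matching ideas can be sketched with a toy example. This is not how WisPaper works internally (I have no visibility into that); the hand-written concept map below is purely an assumption for illustration, standing in for whatever a real engine learns from data.

```python
# Hypothetical concept map: different phrasings that point at the same idea.
CONCEPTS = {
    "sleep deprivation": "SLEEP_LOSS",
    "sleep loss": "SLEEP_LOSS",
    "long-term potentiation": "SYNAPTIC_STRENGTH",
    "synaptic strength": "SYNAPTIC_STRENGTH",
    "hippocampus": "HIPPOCAMPUS",
    "hippocampal": "HIPPOCAMPUS",
}

def extract_concepts(text):
    """Return the set of concepts whose phrasing appears in the text."""
    lowered = text.lower()
    return {concept for phrase, concept in CONCEPTS.items() if phrase in lowered}

def concept_score(query, document):
    """Fraction of the query's concepts that the document also mentions."""
    query_concepts = extract_concepts(query)
    if not query_concepts:
        return 0.0
    shared = query_concepts & extract_concepts(document)
    return len(shared) / len(query_concepts)

query = ("What is the effect of sleep deprivation on long-term "
         "potentiation in the hippocampus?")
related = "Sleep loss reduces synaptic strength in hippocampal slices"
unrelated = "Green roofs reduce urban heat in dense cities"

print(concept_score(query, related))    # 1.0
print(concept_score(query, unrelated))  # 0.0
```

The related title never uses the phrases “sleep deprivation” or “long-term potentiation”, yet it scores a perfect match because it talks about the same concepts.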

For anyone doing exploratory research, this is a huge change in how to search for studies. Recently I tried WisPaper with something as vague as “how stress interacts with memory formation in older adults”. A traditional database would have crumbled under such casual language, but WisPaper gave me a clean list of studies, some on cortisol’s effect on the prefrontal cortex, some on lifestyle changes, and some on epigenetic markers, and indicated how each one related back to my query. According to one researcher, the “intent verification” feature (as WisPaper calls it) filtered out about 90% of the noise. This isn’t a small step forward; it is a major paradigm change.

Building Your Personal Research Library: From Chaos to Control

Many traditional databases let you save citations, and exporting them to programs such as EndNote, Zotero or Mendeley is straightforward enough, but the routine of “click, export, open reference manager, file, repeat” is cumbersome. Once saved, a paper sits untouched until you decide to go back and read it. A tool like WisPaper pairs its AI-driven paper search with a library feature that acts as a private research repository, with citation management similar to Zotero. What really improves the experience is that you can query your own uploaded documents in natural language. Ask “what methodologies does this paper use?” or “summarize the limitations of this paper”, and the AI searches your uploaded PDFs and returns an answer almost immediately.

I tested this on a cluttered folder of more than 50 articles on climate change adaptation and quickly located three articles that referenced the “adaptation pathways” method. That spared me from searching through each paper by hand, saving time and my sanity. Traditional databases only give you download links; once you have downloaded a paper, searching inside it is up to you.

Proactive Discovery: When the Literature Comes to You

Traditional databases are also weak at keeping you up to date. You can set up an email alert for your chosen search criteria and get notified each time a new paper containing your keywords is published. However, the alert usually just sends you an updated list of papers without judging what is actually relevant to you. Many researchers end up unsubscribing because they are inundated with irrelevant results until the notifications lose all meaning. An AI-enabled academic paper search engine works the opposite way. Through an “AI Feed”, you define a topic (for example, “new advances in CRISPR delivery methods”) and the system continually sends you the latest and most relevant articles as they are published. It searches on your behalf every day and delivers only the top 10% most relevant papers, essentially working as your research assistant.
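
The “top 10%” idea is easy to sketch once each new paper has a relevance score. How a real feed computes those scores is not something I know; the numbers and the function name below are made up, and the sketch only shows the filtering step.

```python
import math

def top_fraction(scored_papers, fraction=0.10):
    """Keep the highest-scoring fraction of papers, always at least one."""
    if not scored_papers:
        return []
    keep = max(1, math.ceil(len(scored_papers) * fraction))
    ranked = sorted(scored_papers, key=lambda item: item[1], reverse=True)
    return ranked[:keep]

# Hypothetical relevance scores for 20 papers published today.
scores = [0.12, 0.95, 0.40, 0.33, 0.08, 0.51, 0.27, 0.91, 0.19, 0.62,
          0.05, 0.44, 0.73, 0.30, 0.15, 0.58, 0.22, 0.67, 0.10, 0.36]
todays_papers = [(f"paper-{i}", score) for i, score in enumerate(scores)]

feed = top_fraction(todays_papers)
print(feed)  # [('paper-1', 0.95), ('paper-7', 0.91)]
```

A keyword alert would have sent all 20 papers. Here only two reach the inbox, which is the whole point of the feed.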

On WisPaper, I set up a feed on “generative AI in medical diagnosis”. The following morning I received a few studies, including a preprint on a novel transformer-based architecture for analysing X-rays. I could probably have found it in PubMed myself after an hour or so of adjusting search terms, but the feed had already delivered it. The two approaches differ in how they keep you current, and that difference will shape how you use each of them in future work.

The Human Element: Testimonials from Real Users

You can also rely on what other researchers say. Dr. Li, an associate professor in CS, says: “At last, I don’t have to waste my time looking through irrelevant papers; Intent Verification filtered out 90% of the noise with great accuracy. This is what artificial intelligence should be.” Sarah J., a neuroscience PhD candidate, states: “It is a lifesaver when doing literature reviews. I put in a very broad topic request, and not only did you find me the key foundational papers, you instructed me on why they are important.” Mark T., an independent researcher, has complete confidence in this technology. These are not bots selling a product. The people quoted above have experienced the frustration of conventional database systems first hand and have found relief in a research engine that actually fits their needs as researchers.

I really connect with Sarah’s remark about being told why the papers matter. Conventional literature databases show a list of results; you click through each one, read the abstract, and sometimes skim the introduction before deciding whether it is relevant. This is tiring. With an AI-powered engine, you can be pointed to the exact sentence or figure that answers your question with far less effort. It’s like having an experienced researcher point at a paper and say, “This paper is relevant to your hypothesis.”

Speed, Context, and the Joy of Discovery

Let’s discuss speed, and not just how many milliseconds a query takes to return results, but how fast you can go from a vague idea to a concrete understanding. Traditional databases require an iterative process: enter a query, review the results, revise your keywords, and repeat. It works, but it is slow! An AI academic paper search engine dramatically reduces the number of iterations. You ask in natural language, and it uses natural language understanding to return relevant results, often after a single query. You can then narrow the search with follow-up questions such as “Which paper has the largest sample size?” or “Only give me longitudinal studies.” The AI keeps refining the results throughout the session.
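
Under the hood, a follow-up question amounts to a filter or a sort over results you already have, rather than a fresh search. Here is a rough sketch of the two follow-ups above, assuming the engine has already extracted a study design and sample size for each result. The `Study` record and its values are invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class Study:
    title: str
    design: str        # e.g. "longitudinal" or "cross-sectional"
    sample_size: int

# Hypothetical results returned by a first natural-language query.
results = [
    Study("Study A", "cross-sectional", 1200),
    Study("Study B", "longitudinal", 340),
    Study("Study C", "longitudinal", 2100),
]

def only_design(studies, design):
    """Follow-up: 'Only give me longitudinal studies.'"""
    return [study for study in studies if study.design == design]

def largest_sample(studies):
    """Follow-up: 'Which paper has the largest sample size?'"""
    return max(studies, key=lambda study: study.sample_size)

longitudinal = only_design(results, "longitudinal")
print([study.title for study in longitudinal])  # ['Study B', 'Study C']
print(largest_sample(results).title)            # Study C
```

In a keyword database, each of these refinements would mean rewriting the search string and starting over.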

Discovery also feels different between traditional and AI databases. Traditional databases are transactional: you ask for something, and you either get it or you don’t. An AI-based search engine is conversational and exploratory, so you can go off on tangents (for example, “I found an article about sleep and memory; what do you have on sleep and emotional regulation?”). The engine shifts context with you, which makes room for serendipity: finding articles you were unaware of or did not expect, something a strict, predetermined Boolean string rarely offers.

Practical Considerations: Is AI Always Better?

Traditional databases do have some benefits: they are standardized, they contain peer-reviewed literature, and they are good for known-item searches (for example, “find the 2017 paper by Smith et al. on apoptosis”). Some institutions expect you to use only certain databases for tenure or professional evaluations and for systematic reviews. However, for day-to-day literature exploration, tracking your discipline’s current literature, or researching a new sub-field, an academic paper search engine such as WisPaper is much more user-friendly and efficient than a traditional database.

There is a drawback: the AI is only as effective as the data it has indexed, so if a recent preprint or a niche journal is missing, the paper will not show up in your results. WisPaper claims to check the full text of publications and to keep its feeds constantly up to date, but no system is perfect. There is also the issue of trust. With a traditional database you know the papers were ranked primarily by keyword match, whereas an AI system’s ranking can seem arbitrary or hard to trace back to the query you entered, which raises the question of how far you can trust it. Even with the “Intent Verification” WisPaper offers, the AI is still a model. For now, I would treat an AI academic paper search engine as a great complement to traditional databases, not a full substitute.

A Few Tips for Making the Switch

If you’re seriously considering an AI academic paper search engine, here is what I’ve learned. One: start with a very general, unstructured question. This is where AI really excels at helping you find papers, so do not worry about being exact at this point. Two: use the library function right away. Upload your own PDFs and ask the AI to summarize and/or categorize them. You will be amazed at how fast a disorganized folder of documents turns into an organized, usable knowledge base. Three: create separate feeds for your different projects and interests, so the AI does the daily searching and you can concentrate on thinking and writing. Four: do not discard traditional databases altogether. You will still need them to verify information and to obtain papers that may not be in the AI’s database.

The Future of Literature Search

We are living at an inflection point. Traditional databases were designed for a time of scarcity, when information was difficult to find and search tools were built primarily for librarians. Today we are inundated with information, and our main task is filtering it. An academic paper search engine that understands your intent, learns from your library of papers, and delivers relevant discoveries before you ask for them is more than just useful; it is becoming essential. WisPaper is one example of this type of service, and it will not be the last. The shift from keyword matching to search that relies on understanding concepts is here to stay.

Next time you have to do a literature review, try both options. First search PubMed or Scopus to establish a benchmark, then search WisPaper with the same criteria. Compare the time spent, the relevance of the results, and the serendipitous discoveries each one produced. In my experience, the AI-based paper search engine trades hours of sorting for a few minutes of finding what you need, and time matters in research because you can never get it back. Let your AI research assistant do most of the searching, because you need to generate ideas, design experiments and write papers.