How to search PubMed Central or another full-text literature database using regex?
Abiologist • 0
@2b534cfa
Last seen 1 day ago
Poland

I'd like to search the plant science literature (full text) and return only articles in which the word "three" appears four or more times in the Methods section. Is there any way to do this at all, in any language, online app or helper tool?

I presume I would need to search PubMed Central, although another online database would be fine, or perhaps I would need to download a database first. I would prefer an R solution, but another language would be fine as well, perhaps using JSON and regex.

Some directions I have come across:

Might this be possible using the R package Biotea, or with a SPARQL query using regular expressions against an RDF database?

ChatGPT suggested first downloading the data from Europe PMC and then querying it locally, but wouldn't that mean downloading terabytes of data? That may be possible, but it doesn't seem efficient. Does anyone have experience with this?

I've also looked at scite, which accepts JSON and regex, but it apparently searches only citation statements rather than full-text Methods sections.

At the moment I'm just asking whether this is possible at all, and which direction is the best or easiest to take.
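To make the counting step concrete, here is a minimal Python sketch of what would have to run on each article once its full-text XML is in hand. It assumes the XML follows the JATS layout used by PubMed Central open-access articles (top-level `<sec>` elements under `<body>`, each with a `<title>` and an optional `sec-type` attribute) and that the Methods section can be recognised from either of those. The function names and the inline sample are hypothetical, made up for illustration; real articles label their sections inconsistently, so this heuristic will miss some.

```python
import re
import xml.etree.ElementTree as ET

# Heuristic (an assumption): a section counts as "Methods" if its title
# contains the word "method"/"methods" or its sec-type mentions "methods".
METHODS_TITLE = re.compile(r"\bmethods?\b", re.IGNORECASE)


def methods_text(jats_xml: str) -> str:
    """Return the plain text of all top-level Methods sections."""
    root = ET.fromstring(jats_xml)
    body = root.find(".//body")
    if body is None:
        return ""
    chunks = []
    # Only direct children of <body> are inspected, so nested
    # subsections are included once via itertext() and never double counted.
    for sec in body.findall("sec"):
        sec_type = sec.get("sec-type", "")
        title = sec.findtext("title", default="")
        if "methods" in sec_type.lower() or METHODS_TITLE.search(title):
            chunks.append(" ".join(sec.itertext()))
    return " ".join(chunks)


def count_word(text: str, word: str) -> int:
    """Count case-insensitive whole-word matches of `word` in `text`."""
    pattern = rf"\b{re.escape(word)}\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))


def methods_mentions_at_least(jats_xml: str, word: str = "three",
                              minimum: int = 4) -> bool:
    """True if `word` occurs at least `minimum` times in the Methods text."""
    return count_word(methods_text(jats_xml), word) >= minimum


# Tiny made-up article used only to exercise the functions above.
SAMPLE = """<article><body>
<sec sec-type="intro"><title>Introduction</title>
<p>Three genes were known.</p></sec>
<sec sec-type="materials|methods"><title>Materials and Methods</title>
<p>Three replicates were grown for three weeks.</p>
<sec><title>Sampling</title>
<p>We took three leaves from each of three plants (a threefold excess).</p>
</sec></sec>
</body></article>"""

if __name__ == "__main__":
    print(count_word(methods_text(SAMPLE), "three"))
    print(methods_mentions_at_least(SAMPLE))
```

The download question is separate from the counting. One option that avoids a bulk download would be to narrow the candidates first with an ordinary keyword query against the Europe PMC search service, then fetch full-text XML only for those hits (Europe PMC exposes a per-article `fullTextXML` REST endpoint for open-access records) and run the check above on each. Whether the search service can restrict a keyword to the Methods section is something to confirm in its documentation rather than assume.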

metaMSdata Bioconductor • 139 views

This is off-topic, as it is entirely unrelated to Bioconductor. You also seem to be cross-posting this to more than one community; please at least leave links to the other posts so that people do not duplicate effort if it has already been solved elsewhere.


Apologies, you may well be correct (assuming there is no answer within Bioconductor). I'm having trouble finding a suitable venue for this question, as I have no idea which bioinformatics tools or communities to approach. As soon as I have a suitable link I will add it here.
