AI for Humanists
The AI for Humanists project is developing resources to enable DH scholars to explore how large language models and AI technologies can be used in their research and teaching. Find an annotated bibliography of research papers and tools, a glossary of relevant terms, code tutorials, and information about our workshops.

Google Colab Notebook Tips
By Maria Antoniak · 2021-06-03 · http://www.bertforhumanists.org//quick%20tips/colab-tips

Here are some tips for running BERT in a Google Colab notebook. If you run into strange error messages, if your model takes forever to train, or if your notebook keeps crashing, check to make sure you're following each of these tips!
When preparing your data, you must use a tokenizer that matches your pre-trained model (cased vs. uncased, BERT vs. DistilBERT, and so on); a matching setup is sketched in the first example after this list.
Re-load or re-initialize your model before fine-tuning again with different parameters; otherwise you will keep training the already-fine-tuned weights.
To save space, delete your models when you're done with them (see the cleanup sketch after this list).
To avoid running out of memory, use smaller batch sizes and switch to a lighter model such as DistilBERT.
To take advantage of the GPU, attach your model to the device and set the Colab runtime to GPU (see the device sketch after this list).
When using very small datasets, lower the number of warmup steps.
Use a very small learning rate (around 5e-5, i.e., 5×10⁻⁵); the optimizer sketch after this list shows one such setup.
Factory reset the Colab runtime if CUDA is running out of memory.
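
A minimal sketch of the tokenizer-matching tip, assuming the Hugging Face transformers library (the checkpoint name here is just an example). Loading the tokenizer and the model from the same checkpoint guarantees that casing and vocabulary agree:

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load tokenizer and model from the SAME checkpoint so that casing and
# vocabulary match (mismatched token IDs fail silently, not loudly).
checkpoint = "distilbert-base-uncased"  # example checkpoint; substitute your own
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)

batch = tokenizer(["An example sentence."], padding=True, truncation=True, return_tensors="pt")
```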
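For the memory and GPU tips, a sketch that assumes you have already built a PyTorch train_dataset (the name is hypothetical):

```python
import torch
from torch.utils.data import DataLoader

# Use the GPU if one is attached (Runtime > Change runtime type > GPU in Colab).
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

# Smaller batches use less GPU memory; 8-16 is a safe start for BERT-sized models.
train_loader = DataLoader(train_dataset, batch_size=8, shuffle=True)

# During training, each batch must be moved to the same device as the model:
# batch = {key: tensor.to(device) for key, tensor in batch.items()}
```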
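For the learning-rate and warmup tips, one possible optimizer setup, reusing the model and train_loader from the sketches above:

```python
from torch.optim import AdamW
from transformers import get_linear_schedule_with_warmup

optimizer = AdamW(model.parameters(), lr=5e-5)  # a very small learning rate

num_epochs = 3
num_training_steps = num_epochs * len(train_loader)
# With a very small dataset, cap the warmup steps so the learning rate actually
# ramps up before training ends.
scheduler = get_linear_schedule_with_warmup(
    optimizer,
    num_warmup_steps=min(100, num_training_steps // 10),
    num_training_steps=num_training_steps,
)
```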
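Finally, for the re-initializing and deleting tips, a sketch of how a fine-tuned model can be released before the next run:

```python
import gc
import torch

# Drop the Python reference to the old model, then release cached GPU memory,
# before loading a fresh copy to fine-tune with new parameters.
del model
gc.collect()
torch.cuda.empty_cache()

# If CUDA memory is still exhausted, use Runtime > Factory reset runtime in Colab.
```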
What is BERT?
By Melanie Walsh · 2021-01-20 · http://www.bertforhumanists.org//introductions/Introduction

BERT is a state-of-the-art NLP method trained on a very large dataset of texts: the entirety of English-language Wikipedia (2,500 million words) and a corpus of English-language books (800 million words). Thanks to this large amount of training data and its unique neural network architecture, BERT, and subsequent methods like it (e.g., GPT-2), can understand human language significantly better than previous NLP methods. For example, BERT can identify whether a sentence expresses positive or negative sentiment, predict what sentence should come next in a paragraph, and disambiguate between multivalent words, all with never-before-seen levels of accuracy. One of these capabilities is illustrated in the sketch below.
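As a concrete illustration, this sketch asks BERT to predict a hidden word from its context, the same contextual modeling that underlies the capabilities above. It assumes the Hugging Face transformers library, and the model name is just one example:

```python
from transformers import pipeline

# Ask BERT to fill in the masked word; it ranks candidates by contextual fit.
fill_mask = pipeline("fill-mask", model="bert-base-uncased")
for prediction in fill_mask("The river [MASK] was muddy after the storm."):
    print(prediction["token_str"], round(prediction["score"], 3))
```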