How to modify your encoder, attention mechanism, and decoder to increase model capacity and produce better results, with additional discussion of dataset size and other hyperparameters.

Metamorphosis II [1]

Recap of Previous Work

Over the past two weeks, Ethan Huang and I have been working to study, reconstruct, and improve a contemporary CNN+LSTM image captioning model, with the goal of a final architecture that will allow life science researchers to efficiently look up previously documented molecular compounds. At present, many old papers contain diagrams of molecules drawn in what is known as the skeletal formula, but these representations cannot be readily interpreted by computers…

Additional topics include managing massive datasets and reconstructing complex models developed by others.


As laid out in our initial blog post, we are researching, examining, and modifying deep learning models that convert molecular structure diagrams drawn as skeletal formulas into their corresponding International Chemical Identifier (InChI) form. The aim is for machine learning algorithms to parse previously published papers and articles more quickly and efficiently, so that the chemical compositions present in them can be documented more readily.[1]

How neural networks can be used to help convert molecular structure diagrams into their corresponding International Chemical Identifier (InChI) text strings.

Molecular Structure [1]

Background and Motivation

As we enter a new frontier of predominantly digital media and publications, it becomes increasingly important to learn how to reconcile the old way of doing things with the new. In the field of chemistry, it has been common practice for decades to represent chemical compounds by their structural forms in what is known as the skeletal formula. Past publications are full of these diagrams, but now, as we become increasingly reliant on having computers parse documents and…

How data augmentation, increased image resolution, and model ensembling can all lead to meaningful boosts in test accuracy.

Recap of Previous Work

Over the past few weeks, Ethan Huang and I have been building a convolutional neural network that will enable subsistence farmers in Africa to diagnose their cassava crops more easily and efficiently, and to determine which actions would be most appropriate to stem the spread of disease among their plants and maximize their yields. Currently, many farm holders’ only option is to have local agriculture experts come and inspect their crops in person, but…

An exploration into the components of an effective CNN model: from data extraction and cleaning to transfer learning and architectural design to hyperparameter tuning and more…


As discussed in our initial blog post, our current goal is to create a CNN that will enable subsistence farmers in Sub-Saharan Africa to upload photos of their crops and find out whether their plants are healthy or diseased and, if diseased, which disease they have. We left off having created only a majority-class classifier that always labeled an image as afflicted with Cassava Mosaic Disease (CMD)…

How the solution to low cassava crop yields due to disease may be rooted in novel deep learning techniques.

Cassava Crop Harvest [1]

Background and Motivation

The starchy cassava plant is one of Africa’s most crucial staple crops and the second-largest source of carbohydrates on the continent. While the plant is known for its hardy nature and ability to withstand harsh environmental conditions, rampant disease outbreaks often threaten crop yields and pose a serious threat to the subsistence farmers who grow it. While over 80% of small, household farms in Sub-Saharan Africa grow this root, few have the ability to detect and mitigate the devastating effects…

Setting the Stage

You probably remember learning about the special patterns that occur in nature as a child. These might include cases of symmetry, spirals, tessellations, and fractals, amongst others. One of the most common of these patterns, however, is the Golden Ratio.

The Golden Ratio in Nature

The beautiful imagery evoked by this ratio is eye-catching, as you can see above, but what if I told you there was an even simpler and more elegant law to which nearly all random collections of data in nature adhere? And what if I told you that random data sets weren’t quite so random after…

Griffin McCauley

Applied Mathematics Student at Brown University
