Tip #5: My Starting-Point CNN Model

Go to the beginning of the article

Read the previous tip: Follow Tested Practices

Assuming you are working with non-image data (so transfer learning is not an option) and there is a good reason to use CNN-based models (i.e., there is some continuity between neighbors in at least one dimension of the data; see this post for further explanation), I would start with a CNN architecture with five hidden layers: a first convolutional layer, a first max pooling layer, a second convolutional layer, a second max pooling layer, and a flattening layer. I would fix the minibatch size at 32, the activation function as ReLU, and the optimizer as Adam. These are well-tested choices that typically perform well without much adjustment.
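To make this starting point concrete, here is a minimal Keras sketch of that five-hidden-layer architecture. The 1D convolutions, the input shape (1000 positions by 4 channels), the kernel size of 8, the filter count of 32, and the two-class softmax output layer are all illustrative assumptions, not recommendations; choose them for your own data as discussed next.

```python
# A minimal Keras sketch of the starting-point architecture described above
# (conv -> max pool -> conv -> max pool -> flatten), with ReLU activations,
# the Adam optimizer, and a minibatch size of 32. The input shape, kernel
# size, filter count, and two-class softmax output layer are placeholders.
from tensorflow import keras
from tensorflow.keras import layers

def build_starting_point_cnn(input_shape=(1000, 4), n_filters=32,
                             kernel_size=8, n_classes=2):
    model = keras.Sequential([
        keras.Input(shape=input_shape),
        layers.Conv1D(n_filters, kernel_size, activation="relu"),  # 1st convolutional layer
        layers.MaxPooling1D(pool_size=2),                          # 1st max pooling layer
        layers.Conv1D(n_filters, kernel_size, activation="relu"),  # 2nd convolutional layer
        layers.MaxPooling1D(pool_size=2),                          # 2nd max pooling layer
        layers.Flatten(),                                          # flattening layer
        layers.Dense(n_classes, activation="softmax"),             # output layer
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

model = build_starting_point_cnn()
# Train with the fixed minibatch size of 32 suggested above, e.g.:
# model.fit(X_train, y_train, batch_size=32, epochs=50, validation_split=0.2)
```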

Initial filter sizes should be determined by the characteristics of the data (see this post for more information), and the number of filters or neurons should depend on the number of training samples. As a general rule of thumb, if you have around 400 training samples (with or without augmentation), 30-40 filters/neurons per convolutional layer would be appropriate. If you have around 800 training samples, you could use around 60 filters/neurons per convolutional layer. If you have around 2000 training samples, it might be a good idea to add a third convolutional layer and a third max pooling layer, with 40-60 filters/neurons per convolutional layer.

On the other hand, if you only have around 150-200 training samples, you should probably remove the second convolutional and second max pooling layers, and be aware that you are approaching the limit of what CNN-based (or any DNN-based) models can handle. If your training set is smaller than 150, you should consider whether it is possible to make data augmentation work (see this post) and whether it might be better to switch to an SVM model.
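To summarize these rules of thumb, the hypothetical helper below maps the size of the training set to a suggested number of convolutional/pooling blocks and filters per convolutional layer. The exact cutoffs between the sample sizes mentioned above are rough interpolations, so treat the output only as a starting point.

```python
# A rough, hypothetical encoding of the rules of thumb in the two paragraphs
# above; the cutoffs between the stated sample sizes are interpolated guesses.
def suggest_starting_architecture(n_train):
    """Return (number of conv + max-pooling blocks, filters per conv layer)."""
    if n_train < 150:
        # Below ~150 samples: consider data augmentation, or an SVM instead of a CNN.
        return None
    if n_train <= 200:
        return 1, 32   # ~150-200 samples: drop the second conv/pooling block
    if n_train < 600:
        return 2, 35   # ~400 samples: 30-40 filters per conv layer
    if n_train < 1400:
        return 2, 60   # ~800 samples: around 60 filters per conv layer
    return 3, 50       # ~2000+ samples: add a third conv/pooling block, 40-60 filters

print(suggest_starting_architecture(400))   # (2, 35)
print(suggest_starting_architecture(2500))  # (3, 50)
```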

About the Author: Justin Li earned his Ph.D. in Neurobiology from the University of Wisconsin–Madison and an M.S. in Computer Science from the University of Houston, following a B.S. in Biophysics. He served as an Assistant Professor at the University of Minnesota Medical School (2004–2009) and as Chief Bioinformatics Officer at LC Sciences (2009–2013) before joining AccuraScience as Lead Bioinformatician in 2013. Justin has published around 50 research papers and led the development of 12 bioinformatics databases and tools - including miRecords, siRecords, and PepCyber - while securing over $3.4M in research funding between 2004 and 2009 as PI, co-PI, or co-I. He has worked on NGS data analysis since 2007, with broad expertise in genome assembly, RNA-seq, scRNA-seq, scATAC-seq, Multiome, ChIP-seq and epigenomics, metagenomics, and long-read technologies. His recent work includes machine learning applications in genomics, AlphaFold modeling, structural bioinformatics, spatial transcriptomics, immune repertoire analysis, and multi-omics integration.

Need assistance in your AI/deep learning project? We may be able to help. Take a look at the intro to our bioinformatician team, see some of the advantages of using our team's help here, and check out our FAQ page!

Send us an inquiry, chat with us online (during our business hours 9-5 Mon-Fri U.S. Central Time), or reach us in other ways!
