Robotic apple harvesting has received much research attention in the past few years due to growing shortage and rising cost in labor. One key enabling technology towards automated harvesting is accurate and robust apple detection, which poses great challenges as a result of the complex orchard environment that involves varying lighting conditions and foliage/branch occlusions. This letter reports on the development of a novel deep learning-based apple detection framework named Suppression Mask R-CNN. Specifically, we first collect a comprehensive apple orchard dataset for “Gala” and “Blondee” apples, using a color camera, under different lighting conditions (overcast and front lighting vs. back lighting). We then develop a novel suppression Mask R-CNN for apple detection, in which a suppression branch is added to the standard Mask R-CNN to suppress non-apple features generated by the original network. Comprehensive evaluations are performed, which show that the developed suppression Mask R-CNN network outperforms state-of-the-art models with a higher F1-score of 0.905 and a detection time of 0.25 second per frame on a standard desktop computer.