Identification of Triple Negative Breast Cancer Epidemiological Risk Factors for African American Women




Del Valle, Michelle
Calcagno, Alexa
Hayatshahi, Sayyed
Liu, Jin
Radler, Charlene
Morid, Mohammad




Purpose: Triple-negative breast cancer (TNBC) is a basal-like breast cancer subtype that lacks expression of the estrogen receptor, the progesterone receptor, and HER2. TNBC has a poor prognosis with limited treatment options, and it occurs at higher rates in African American women than in White American women. The purpose of this project was to identify race-dependent socioeconomic risk factors and to propose an initial guideline for TNBC screening in African American women.

Methods: We used data from the Surveillance, Epidemiology, and End Results (SEER) Program of the National Cancer Institute. Our dataset included 35,976 female TNBC patients in the United States, each described by 441 attributes. The top attributes for TNBC in African American female patients were identified using machine learning and optimization algorithms. Traditional statistical techniques were then used to validate the key risk factors identified by machine learning. The top attributes represent socioeconomic, phenotypic, and epidemiologic factors.

Results: The top factors for African American TNBC patients, identified by both machine learning and descriptive statistics, were marital status at diagnosis, age at diagnosis, and insurance status. Specifically, single status, lack of private insurance, and younger age at diagnosis corresponded to worse prognosis and increased mortality.

Conclusions: Based on these attributes, African American women would benefit from TNBC screening beginning in their mid-to-late 40s, with particular attention given to single and uninsured women. Such screening might lead to earlier diagnosis, prevention, and treatment, and might decrease the incidence and mortality of TNBC in African American women.