
Table 4 Distribution of CVD and non-CVD cases in each cluster with different predetermined number of clusters in the testing set

From: Detecting cardiovascular diseases using unsupervised machine learning clustering based on electronic medical records

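| Predetermined No. clusters | Presence of CVD | Cluster 1 | Cluster 2 | Cluster 3 | Cluster 4 | Cluster 5 | Cluster 6 | Cluster 7 | Cluster 8 | Total |
|---|---|---|---|---|---|---|---|---|---|---|
| k = 2 | Non-CVD | 907 | 8191 | | | | | | | 9098 |
| | CVD | 5213 | 1279 | | | | | | | 6492 |
| | Total | 6120 | 9470 | | | | | | | 15,590 |
| | % of CVD | 0.8518 | 0.1351 | | | | | | | |
| k = 4 | Non-CVD | 377 | 8105 | 616 | 0 | | | | | 9098 |
| | CVD | 306 | 1169 | 5017 | 0 | | | | | 6492 |
| | Total | 683 | 9274 | 5633 | 0 | | | | | 15,590 |
| | % of CVD | 0.4480 | 0.1261 | 0.8906 | 0 | | | | | |
| k = 8 | Non-CVD | 5978 | 476 | 518 | 0 | 0 | 311 | 1815 | 0 | 9098 |
| | CVD | 603 | 806 | 4381 | 3 | 1 | 213 | 485 | 0 | 6492 |
| | Total | 6581 | 1282 | 4899 | 3 | 1 | 524 | 2300 | 0 | 15,590 |
| | % of CVD | 0.0916 | 0.6287 | 0.8943 | 1.0000 | 1.0000 | 0.4065 | 0.2109 | | |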
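The "% of CVD" row appears to be the number of CVD cases in a cluster divided by that cluster's total, reported as a proportion rather than a percentage. The sketch below checks this reading against the k = 2 counts from Table 4. The function name `cvd_share` and the choice to return `None` for an empty cluster are illustrative assumptions, not part of the source.

```python
# Counts copied from the k = 2 rows of Table 4 (testing set).
non_cvd = [907, 8191]
cvd = [5213, 1279]


def cvd_share(cvd_counts, non_cvd_counts):
    """Return CVD / (CVD + non-CVD) per cluster, rounded to 4 decimals.

    Empty clusters yield None here (an assumption made for this sketch).
    """
    shares = []
    for c, n in zip(cvd_counts, non_cvd_counts):
        total = c + n
        shares.append(round(c / total, 4) if total else None)
    return shares


k2_shares = cvd_share(cvd, non_cvd)
print(k2_shares)  # [0.8518, 0.1351]
```

The same function reproduces the non-empty clusters in the k = 4 and k = 8 rows. For empty clusters the table itself is not uniform: it reports 0 at k = 4 and leaves the cell blank at k = 8, whereas this sketch returns `None` in both cases.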