What is a CPU?
A CPU (Central Processing Unit) is the "brain" of a computer: it executes software and coordinates hardware such as the mouse and keyboard so that they work as expected. A CPU is made up of the following standard components:
- Computing unit: performs arithmetic and logic operations on data stored in main memory.
- Control unit: reads programs stored in main memory and carries out the various processes by decoding their instructions with a decoder.
- Clock: the signal that drives the CPU; its speed indicates how fast the CPU processes.
- Registers: small storage areas within the CPU.
These components work together to enable high-speed, apparently parallel task processing: driven by its clock, the CPU rapidly switches among hundreds of different tasks per second. This is why a computer can seemingly do several things at once, such as displaying the desktop while connecting to the Internet.
CPUs also have the concept of "cores." Now that clock speeds are reaching their physical limits, the processing unit described above is packaged as a "core," and it has become mainstream to put multiple cores on a single chip so that multiple processes can run in parallel. Having multiple cores is like having multiple CPUs built into a single CPU.
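You can check how many logical cores the machine you are working on actually exposes from Python. The following is just a quick check; the number reported depends entirely on the environment (a default Google Colab runtime, for example, typically reports only a couple of cores).

import os

# Number of logical cores the operating system makes available to Python.
# The value depends on the machine this is run on.
print(os.cpu_count())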
What is a GPU?
A GPU (Graphics Processing Unit) is a specialized processor with enhanced mathematical computing power, suited to tasks such as computer graphics and machine learning. Rendering an image, for example, requires a huge number of mathematical operations, and those operations must be carried out in parallel to finish in time; CPUs are not built to handle that kind of load. Like CPUs, GPUs have cores and memory, but a GPU packs many cores that specialize in processing data in parallel, which is why they are used in the graphics domain. Because GPUs offer better parallel-processing performance than CPUs and excel at matrix operations, they are also widely used in deep learning.
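To make the matrix-operation point concrete, the following is a minimal sketch using TensorFlow (the library behind the Keras code later in this article). It times the same large matrix multiplication on the CPU and on the GPU; it assumes a Colab runtime with a GPU attached, the exact timings will vary, and the first GPU call also includes data transfer.

import time
import tensorflow as tf

# Two large random matrices; multiplying them is a highly parallel workload.
a = tf.random.normal([4000, 4000])
b = tf.random.normal([4000, 4000])

for device in ['/CPU:0', '/GPU:0']:  # '/GPU:0' assumes a GPU runtime is attached
    with tf.device(device):
        start = time.time()
        c = tf.matmul(a, b)
        _ = c.numpy()  # wait for the result so the timing is meaningful
    print(device, round(time.time() - start, 3), 'seconds')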
Differences between CPUs and GPUs
The following are the differences between CPUs and GPUs in terms of the number of cores and areas of expertise.
| | CPU | GPU |
|---|---|---|
| Core count | Several | Thousands |
| Areas of expertise | Sequential and complex computational processing | Parallel processing |
Core count
A core is where the computational processing of a computer takes place, and the more cores, the greater the amount of processing that can be done at once.
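As a rough illustration of this, the code below runs the same CPU-bound calculation first one job at a time and then spread across worker processes with Python's multiprocessing module. The speedup depends on how many cores the machine really has, and count_primes here is just a made-up workload for the example.

import time
from multiprocessing import Pool

def count_primes(limit):  # hypothetical CPU-bound workload
    count = 0
    for n in range(2, limit):
        if all(n % d != 0 for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count

if __name__ == '__main__':
    jobs = [50_000] * 8  # eight independent chunks of work

    start = time.time()
    [count_primes(j) for j in jobs]  # one job at a time on a single core
    print('sequential:', round(time.time() - start, 2), 's')

    start = time.time()
    with Pool() as pool:  # one worker process per available core by default
        pool.map(count_primes, jobs)  # jobs run in parallel across cores
    print('parallel:  ', round(time.time() - start, 2), 's')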
Areas of Expertise
CPUs are good at processing sequential, complex tasks. For example, a CPU executes operations according to program instructions while also controlling memory, the display, the mouse, the keyboard, and so on. Such instructions must be processed in order, so adding more cores does not necessarily make this kind of work faster. GPUs, in contrast, are good at processing large numbers of simple tasks. Image processing such as 3D graphics, for example, consists of an enormous number of simple calculations that can run at the same time, which makes GPUs well suited to it.
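The difference in workload shape can be seen in a few lines of Python (purely illustrative). In the first loop each step needs the result of the previous one, so the steps cannot run at the same time no matter how many cores are available; in the second, every element is independent, which is exactly the kind of work a GPU's thousands of cores can split up.

import numpy as np

# Sequential work: each value depends on the one before it,
# so the iterations have to run one after another.
balance = 100.0
for _ in range(12):
    balance = balance * 1.01  # the next value needs the previous value

# Parallel-friendly work: every element is independent,
# so all of them could in principle be computed at once.
pixels = np.random.rand(1_000_000)
brightened = pixels * 1.2 + 0.05  # the same simple operation applied to each element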
Performance Comparison
The following code trains a convolutional neural network on the CIFAR-10 image dataset (50,000 training images, processed as 1,563 batches of 32 per epoch). We will run this code on both the CPU and the GPU in a Google Colab environment and compare the performance.
import numpy as np
import matplotlib.pyplot as plt
import keras
from keras.datasets import cifar10
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras.optimizers import Adam

# Load CIFAR-10 (50,000 training and 10,000 test images, 32x32 RGB).
(x_train, t_train), (x_test, t_test) = cifar10.load_data()

batch_size = 32
epochs = 1
n_class = 10

# Convert the integer labels (0-9) to one-hot vectors.
t_train = keras.utils.to_categorical(t_train, n_class)
t_test = keras.utils.to_categorical(t_test, n_class)

# A small convolutional network: two convolution blocks followed by a dense classifier.
model = Sequential()
model.add(Conv2D(32, (3, 3), padding='same', input_shape=x_train.shape[1:]))
model.add(Activation('relu'))
model.add(Conv2D(32, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, (3, 3), padding='same'))
model.add(Activation('relu'))
model.add(Conv2D(64, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(256))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(n_class))
model.add(Activation('softmax'))

model.compile(optimizer=Adam(), loss='categorical_crossentropy', metrics=['accuracy'])

# Scale pixel values from 0-255 to 0-1 before training.
x_train = x_train / 255
x_test = x_test / 255

# Train for one epoch: 50,000 images / batch size 32 = 1,563 steps.
model.fit(x_train, t_train, epochs=epochs, batch_size=batch_size, validation_data=(x_test, t_test))
First, we run it on the CPU; one epoch takes 255 seconds.
1563/1563 [==============================] - 255s 163ms/step - loss: 1.4949 - accuracy: 0.4569 - val_loss: 1.0942 - val_accuracy: 0.6151
Next, we run it on the GPU; one epoch takes just 9 seconds.
1563/1563 [==============================] - 9s 5ms/step - loss: 1.5829 - accuracy: 0.4223 - val_loss: 1.2229 - val_accuracy: 0.5606
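Incidentally, to confirm that the Colab runtime is actually using a GPU (it can be switched under Runtime > Change runtime type), a quick check like the following can be run before training. This is a small sketch using TensorFlow's device-listing API.

import tensorflow as tf

# Lists the GPU devices visible to TensorFlow; an empty list means
# the notebook is running on a CPU-only runtime.
print(tf.config.list_physical_devices('GPU'))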
As these results show, choosing a GPU can greatly reduce training time.
References