Current image registration algorithms that rely on the internal parameters of sensing devices for image alignment face the difficulty of acquiring precise device parameters and achieving high mapping precision, while those that use matched image features to compute an image homography matrix for registration make insufficient use of scene depth information. Based on this observation, we propose a method that generates more realistic image registration data from monocular images and their depth maps, and uses these data to train a pixel-wise image registration network, the PIR-Net, for fast, accurate, and practical image registration. We construct a large-scale, multi-view, realistic image registration database with pixel-wise depth information that imitates real-world conditions: the multi-view image registration (MVR) dataset, which contains 7,240 pairs of RGB images and their corresponding registration labels. With this dataset, we train an encoder-decoder based, fully convolutional image registration network, the PIR-Net. Extensive experiments on the MVR dataset demonstrate that the PIR-Net can predict a pixel-wise image alignment matrix for multi-view RGB images without access to the camera's internal parameters, and that it outperforms traditional image registration methods. On the MVR dataset, the registration error of the PIR-Net is only 18% of that of the general feature-matching method (SIFT+RANSAC), and its time cost is 30% lower.
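For reference, the feature-matching baseline cited above (SIFT+RANSAC homography estimation) can be sketched as follows. This is a minimal illustration assuming OpenCV (version 4.4 or later, where SIFT is in the main module); the function name, file paths, and thresholds are illustrative and not taken from the paper.

```python
# Sketch of the SIFT+RANSAC homography baseline (illustrative, not the paper's code).
import cv2
import numpy as np

def estimate_homography(img_a_path: str, img_b_path: str) -> np.ndarray:
    """Estimate a 3x3 homography mapping image A onto image B from SIFT matches."""
    img_a = cv2.imread(img_a_path, cv2.IMREAD_GRAYSCALE)
    img_b = cv2.imread(img_b_path, cv2.IMREAD_GRAYSCALE)

    # Detect keypoints and compute SIFT descriptors in both views.
    sift = cv2.SIFT_create()
    kp_a, des_a = sift.detectAndCompute(img_a, None)
    kp_b, des_b = sift.detectAndCompute(img_b, None)

    # Match descriptors and keep matches passing Lowe's ratio test.
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    matches = matcher.knnMatch(des_a, des_b, k=2)
    good = [m for m, n in matches if m.distance < 0.75 * n.distance]

    src = np.float32([kp_a[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst = np.float32([kp_b[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)

    # RANSAC rejects outlier matches while fitting a single global homography;
    # this is the step that ignores per-pixel scene depth.
    H, _mask = cv2.findHomography(src, dst, cv2.RANSAC, ransacReprojThreshold=5.0)
    return H
```

Because such a baseline fits one global 3x3 homography for the whole image pair, it cannot account for per-pixel depth variation, which is the limitation the pixel-wise alignment predicted by the PIR-Net is intended to address.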