Multi-sensor integrated navigation technology has been applied to the indoor navigation and positioning of robots. For the problems of a low navigation accuracy and error accumulation, for mobile robots with a single sensor, an indoor mobile robot positioning method based on a visual and inertial sensor combination is presented in this paper. First, the visual sensor (Kinect) is used to obtain the color image and the depth image, and feature matching is performed by the improved scale-invariant feature transform (SIFT) algorithm. Then, the absolute orientation algorithm is used to calculate the rotation matrix and translation vector of a robot in two consecutive frames of images. An inertial measurement unit (IMU) has the advantages of high frequency updating and rapid, accurate positioning, and can compensate for the Kinect speed and lack of precision. Three-dimensional data, such as acceleration, angular velocity, magnetic field strength, and temperature data, can be obtained in real-time with an IMU. The data obtained by the visual sensor is loosely combined with that obtained by the IMU, that is, the differences in the positions and attitudes of the two sensor outputs are optimally combined by the adaptive fade-out extended Kalman filter to estimate the errors. Finally, several experiments show that this method can significantly improve the accuracy of the indoor positioning of the mobile robots based on the visual and inertial sensors.