Augmented Reality Using Pose Estimation on the Android Mobile Platform

Joshua French

What is Augmented Reality (AR)?

A live view of a real, physical environment whose elements are augmented by computer-generated sensory input

Sensory input could be graphics, video, text, audio…

Combining AR with sensors and ideas from computer vision allows the augmentation to become interactive

Devices include head-mounted displays, projectors, glasses, televisions, handheld mobile devices

Android

Open-source, Linux-based OS for mobile devices such as tablets and cell phones

Most devices have many sensors such as cameras, touch screens, accelerometers, GPS, and others for adding interactivity to an AR application

Project: Rendering a Virtual Object in a Scene

Pose Estimation - Two methods:

• Positioning sensors (GPS, accelerometers, magnetic/gravitational sensors, …)

• 2D-3D correspondences

My Project:

• Camera-based pose estimation from 2D-3D correspondences

Pose Estimation Using Correspondences

The algorithms have two main steps:

1) Determine 2D-3D correspondences

2) Calculate pose from correspondences

2D-3D Correspondences

• Fiducials: objects placed in the field of view and used as points of reference

• Markers

  • CCCs

  • QR codes

  • AprilTags

  • ARToolkit markers

• Environment features

Example of Detecting a Fiducial Marker: AprilTag

Procedure:

1) Detect line segments

2) Find the 4 corners of a quad formed by four sequential line segments

3) Compute tag pose using four-point homography estimation

4) Decode tag

Step 1: The gradient magnitude and direction are computed at each pixel. Pixels with similar gradients are clustered into components, and a line segment is then fit to each component using least squares.

Step 2: A depth-first search of depth four is used to find quads whose segments obey a counter-clockwise winding order.

Step 4: Using the calculated homography, the tag-relative coordinates of each bit field are transformed into image coordinates, and the resulting pixels are thresholded to determine whether each bit is black or white.
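Steps 3 and 4 can be illustrated with a minimal sketch, assuming NumPy and a tag whose bit cells lie on an n x n grid over the unit square. The function names (`homography_from_points`, `apply_homography`, `decode_bits`), the grid layout, and the fixed threshold of 128 are assumptions made for illustration; they are not taken from the AprilTag library, which uses its own tag layout and thresholding.

```python
import numpy as np

def homography_from_points(src, dst):
    """Estimate H (3x3) mapping src -> dst from four or more point pairs.

    Each pair contributes two linear equations in the nine entries of H;
    the solution is the null vector of the stacked system (found via SVD).
    """
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    A = np.array(rows, dtype=float)
    _, _, vt = np.linalg.svd(A)
    H = vt[-1].reshape(3, 3)
    return H / H[2, 2]  # assumes H[2, 2] is non-zero (non-degenerate view)

def apply_homography(H, pt):
    """Map a 2D point through H using homogeneous coordinates."""
    p = H @ np.array([pt[0], pt[1], 1.0])
    return p[:2] / p[2]

def decode_bits(image, H, n, threshold=128):
    """Sample the centre of each cell of an n x n grid and threshold it."""
    bits = np.zeros((n, n), dtype=int)
    for r in range(n):
        for c in range(n):
            # Cell centre in tag-relative coordinates (unit square).
            u, v = apply_homography(H, ((c + 0.5) / n, (r + 0.5) / n))
            bits[r, c] = int(image[int(round(v)), int(round(u))] > threshold)
    return bits

# Synthetic example: a 2x2 "tag" occupying pixels 10..30 of a 40x40 image,
# with only the top-left cell white.
image = np.zeros((40, 40), dtype=np.uint8)
image[10:20, 10:20] = 255
tag_corners = [(0, 0), (1, 0), (1, 1), (0, 1)]
image_corners = [(10, 10), (30, 10), (30, 30), (10, 30)]
H = homography_from_points(tag_corners, image_corners)
bits = decode_bits(image, H, n=2)
```

In a real detector the image corners would come from the quad found in step 2, and the decoded bits would then be matched against the tag family's codebook.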

Factors affecting accuracy

• Environmental illumination

• Tag non-planarity (bending)

• Lens distortion

• Angle of tag

• Distance to tag

Calculating Pose

Least Squares

Direct Linear Transform (DLT)

PosIt

Nestor

SLAM/CLAM

Least Squares

Find $x$ (pose) to minimize the error $E = \|f(x) - y_0\|^2$

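Algorithm:

1) Guess pose ($x = x_0$)

2) Compute predicted image points ($y = f(x)$); the residual error is $dy = y_0 - y$, so that adding the correction $dx$ moves the prediction toward the observations

3) Calculate Jacobian ($J = \partial f / \partial x$)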

4) With $dy = J\,dx$, solve for $dx$ using the pseudo-inverse: $dx = (J^T J)^{-1} J^T dy$

5) Set $x \leftarrow x + dx$

6) Repeat steps 2-5 until convergence
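The iteration above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the project's implementation: the pose parameterization (a rotation vector plus a translation), the focal length of 800 pixels, the finite-difference Jacobian, and all function names are assumptions. The residual is taken as observed minus predicted so that the update `x + dx` reduces the error.

```python
import numpy as np

def rodrigues(r):
    """Rotation matrix from a rotation vector (axis * angle)."""
    theta = np.linalg.norm(r)
    if theta < 1e-12:
        return np.eye(3)
    k = r / theta
    K = np.array([[0, -k[2], k[1]], [k[2], 0, -k[0]], [-k[1], k[0], 0]])
    return np.eye(3) + np.sin(theta) * K + (1 - np.cos(theta)) * (K @ K)

def project(x, pts3d, focal=800.0):
    """f(x): pinhole projection of 3D points for pose x = (rx, ry, rz, tx, ty, tz)."""
    cam = pts3d @ rodrigues(x[:3]).T + x[3:]
    return (focal * cam[:, :2] / cam[:, 2:3]).ravel()

def numeric_jacobian(f, x, eps=1e-6):
    """Forward-difference approximation of J = df/dx."""
    y = f(x)
    J = np.zeros((y.size, x.size))
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = eps
        J[:, i] = (f(x + step) - y) / eps
    return J

def gauss_newton(f, y0, x0, iters=50, tol=1e-10):
    x = np.asarray(x0, dtype=float)           # 1) initial guess
    for _ in range(iters):
        dy = y0 - f(x)                        # 2) residual: observed - predicted
        J = numeric_jacobian(f, x)            # 3) Jacobian
        dx = np.linalg.solve(J.T @ J, J.T @ dy)  # 4) dx = (J^T J)^-1 J^T dy
        x = x + dx                            # 5) update
        if np.linalg.norm(dx) < tol:          # 6) stop on convergence
            break
    return x

# Synthetic test: six non-coplanar points, a known pose, and a nearby guess.
pts3d = np.array([[-0.5, -0.5, -0.5], [0.5, -0.5, -0.5], [0.5, 0.5, -0.5],
                  [-0.5, 0.5, -0.5], [-0.5, -0.5, 0.5], [0.5, 0.5, 0.5]])
x_true = np.array([0.1, -0.2, 0.05, 0.3, -0.1, 5.0])
y_obs = project(x_true, pts3d)
x_est = gauss_newton(lambda x: project(x, pts3d), y_obs,
                     x0=[0.0, 0.0, 0.0, 0.0, 0.0, 5.5])
```

Like any Gauss-Newton scheme, this relies on the initial guess being reasonably close to the true pose; a poor guess can lead to divergence or a wrong local minimum.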

Direct Linear Transform

PosIt

Two-part algorithm:

• POS (Pose from Orthography and Scaling)

  • Approximates perspective projection with a scaled orthographic projection and solves for the rotation matrix and translation vector

• IT (with ITerations)

  • Iteratively applies a scale factor to each point to improve the scaled orthographic approximation until a threshold is met
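A compact sketch of the two parts, assuming NumPy, at least four non-coplanar object points, image coordinates measured relative to the image centre, and a known focal length in pixels. The function name `posit` and its parameters are illustrative; the returned translation is the camera-frame position of the first object point, which serves as the reference point.

```python
import numpy as np

def posit(object_pts, image_pts, focal, max_iters=100, tol=1e-12):
    object_pts = np.asarray(object_pts, dtype=float)
    image_pts = np.asarray(image_pts, dtype=float)
    A = object_pts[1:] - object_pts[0]   # object vectors from the reference point
    B = np.linalg.pinv(A)                # pseudo-inverse, computed once
    x0, y0 = image_pts[0]
    eps = np.zeros(len(A))               # per-point scale corrections, start at 0
    for _ in range(max_iters):
        # POS: solve the scaled orthographic system for the scaled rows I, J.
        xp = image_pts[1:, 0] * (1 + eps) - x0
        yp = image_pts[1:, 1] * (1 + eps) - y0
        I, J = B @ xp, B @ yp
        s1, s2 = np.linalg.norm(I), np.linalg.norm(J)
        i, j = I / s1, J / s2
        k = np.cross(i, j)
        s = 0.5 * (s1 + s2)              # scale of the orthographic projection
        z0 = focal / s                   # depth of the reference point
        # IT: update the per-point corrections from the new pose estimate.
        new_eps = (A @ k) / z0
        converged = np.max(np.abs(new_eps - eps)) < tol
        eps = new_eps
        if converged:
            break
    R = np.vstack([i, j, k])             # rows are the camera axes in object frame
    T = np.array([x0, y0, focal]) / s
    return R, T

# Synthetic test: project cube corners with a known pose, then recover it.
a, b = 0.3, -0.2
Rz = np.array([[np.cos(a), -np.sin(a), 0], [np.sin(a), np.cos(a), 0], [0, 0, 1]])
Rx = np.array([[1, 0, 0], [0, np.cos(b), -np.sin(b)], [0, np.sin(b), np.cos(b)]])
R_true = Rz @ Rx
T_true = np.array([0.2, -0.1, 10.0])
focal = 800.0
obj = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 0], [1, 0, 1]],
               dtype=float)
cam = obj @ R_true.T + T_true
img = focal * cam[:, :2] / cam[:, 2:3]
R_est, T_est = posit(obj, img, focal)
```

The first pass through the loop, with all corrections at zero, is the pure POS estimate; each later pass refines it. The method works best when the object is small relative to its distance from the camera, as in this example.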

AndAR https://code.google.com/p/andar/
