Python and Image Processing Exercises  2015-11-24  Shirai Hidetoshi seminar handout (白井英俊ゼミ資料)

1. Introduction --- how to make pickles

Lutz, M. (2009) 『初めての Python』 (Learning Python, Japanese edition). O'Reilly Japan, p.11

With pickle, a standard Python module, object persistence is easy: you can write very simple programs that save Python objects to a file and later restore them.

Ibid., p.193: Saving objects directly to a file with pickle

Using the pickle module, any Python object can be saved directly to a file without first converting it to a string. It is a broadly useful tool for saving data to a file and reading it back. For example, to save a dictionary to a file with pickle:

>>> D = {'a': 1, 'b': 2}
>>> F = open('datafile.txt', 'w')
>>> import pickle
>>> pickle.dump(D, F)    # save the object to the file with pickle
>>> F.close()

To read the dictionary back from the file:

>>> F = open('datafile.txt')
>>> E = pickle.load(F)   # restore the object from the file
>>> E
{'a': 1, 'b': 2}

(Note: this is Python 2 code; in Python 3 the file must be opened in binary mode, 'wb' / 'rb'.)

When a dictionary is saved this way, none of the splitting and conversion steps shown earlier in the book are needed to read it back. The pickle module automatically performs what is called serialization: conversion from an object to a byte stream and from a byte stream back to an object. The programmer has almost nothing to do. However, if you read the byte-stream data as-is and display it, it is mostly meaningless to a human (and with some serialization modes it becomes even more cryptic):

>>> open('datafile.txt').read()
"(dp0\nS'a'\np1\nI1\nsS'b'\np2\nI2\ns."

pickle converts this format back to the original object automatically, so you do not have to write any code for that. For details, consult the Python standard library manual, or import pickle in the interactive interpreter and run help(pickle). It is also worth looking into the shelve module: shelve is basically the same as pickle, except that the saved file becomes a kind of "database" whose entries can be accessed by key, like a dictionary.

2. Capturing images from a camera

桑井・豊沢・永田 (2014) 『実践 OpenCV2.4 for Python』 Cutt System, Section 2.5 "Displaying camera images"
Download the "OpenCV2.4 for Python sample programs" archive from the seminar page: DL/2_5/2_5.py

The essential part:

import cv2                   # use OpenCV2
src = cv2.VideoCapture(0)    # prepare to capture images from the camera
retval, frame = src.read()   # grab one frame
cv2.imshow("Camera", frame)  # display the frame
key = cv2.waitKey(33)        # without this, nothing is actually displayed
src.release()                # close the video source (always close what you open)

3.
Face Recognition

桑井・豊沢・永田 (2014) 『実践 OpenCV2.4 for Python』 Cutt System, Section 5.6 "Object detection (faces, eyes, people)"
Download the "OpenCV2.4 for Python sample programs" archive from the seminar page: DL/5_6/5_6.py

The essential part (in addition to 2_5):

HAAR_FILE = 'haarcascade_frontalface_default.xml'  # a Haar-like feature cascade; frontal face detection
cascade = cv2.CascadeClassifier(HAAR_FILE)         # load the Haar-like feature file
# face detection (returns the coordinates etc. of "face-like" regions at various sizes)
objects = cascade.detectMultiScale(frame, scaleFactor=1.1, minNeighbors=3,
                                   flags=cv2.CASCADE_SCALE_IMAGE, minSize=(0, 0))
for x, y, w, h in objects:                         # loop over the detections
    cv2.rectangle(frame, (x, y), (x+w, y+h), (0, 255, 0), 4, cv2.CV_AA, 0)  # draw a rectangle
cv2.imshow(windowName_1, frame)                    # show the detection result

Haar-like features capture patterns of light-dark contrast in the image, so they are affected by scene brightness; LBP features are an alternative. Both methods recognize the target object by combining many small features. A feature attracting much attention in recent years is the HOG feature.

OpenCV provides ready-made Haar-like and LBP cascade files for face detection, and you can also train your own with OpenCV's tools. To build a cascade for face detection, you need a large number of "positive" images that contain faces and "negative" images that contain none; for good accuracy, reportedly at least 7,000 positive and 3,000 negative images are desirable.

Table 1. Main cascade files for detection (among many others; stored in OpenCV's data directory)

Type of cascade classifier                    XML filename
Face detector (default)                       haarcascade_frontalface_default.xml
Face detector (fast Haar)                     haarcascade_frontalface_alt2.xml
Face detector (fast LBP)                      lbpcascade_frontalface.xml
Profile (side-looking) face detector          haarcascade_profileface.xml
Eye detector (separate for left and right)    haarcascade_lefteye_2splits.xml
Mouth detector                                haarcascade_mcs_mouth.xml
Nose detector                                 haarcascade_mcs_nose.xml
Whole person detector                         haarcascade_fullbody.xml

Explanation of the detectMultiScale parameters:
Baggio, D. L. et al. (2012) Mastering OpenCV with Practical Computer Vision Projects. Packt, p.268 (available on the seminar page)

minFeatureSize: This parameter determines the minimum face size that we care about, typically 20 x 20 or 30 x 30 pixels, but this depends on your use case and image size.
If you are performing face detection on a webcam or smartphone where the face will always be very close to the camera, you could enlarge this to 80 x 80 to have much faster detections, or if you want to detect far away faces, such as on a beach with friends, then leave this as 20 x 20.

searchScaleFactor: This parameter determines how many different sizes of faces to look for; typically it would be 1.1 for good detection, or 1.2 for faster detection that does not find the face as often.

minNeighbors: This parameter determines how sure the detector should be that it has detected a face; typically a value of 3, but you can set it higher if you want more reliable detections, even if many faces are then missed.

flags: This parameter allows you to specify whether to look for all faces (default) or only look for the largest face (CASCADE_FIND_BIGGEST_OBJECT). If you only look for the largest face, it should run faster. There are several other parameters you can add to make the detection about one or two percent faster, such as CASCADE_DO_ROUGH_SEARCH or CASCADE_SCALE_IMAGE.

References:
http://opencv.blog.jp/python/anime_face_detect
http://blog.adjust-work.com/212/
http://www.takunoko.com/blog/pythonで遊んでみる-part1-opencvで顔認識/
http://rest-term.com/archives/3131/
http://shkh.hatenablog.com/entry/2012/11/03/052251

4. The road to face identification

Howse, J. (2013) OpenCV Computer Vision with Python. Packt, Chapter 4 (with code)
Baggio, D. L. et al. (2012) Mastering OpenCV with Practical Computer Vision Projects. Packt, Chapter 8

(1) Face detection --- what we have covered so far: use the Haar or LBP cascades provided with OpenCV, or cascades obtained by training the features yourself.

(2) Face preprocessing

Face recognition is extremely vulnerable to changes in lighting conditions, face orientation, face expression, and so on, so it is very important to reduce these differences as much as possible.
The easiest form of face preprocessing is just to apply histogram equalization using the equalizeHist() function, but for reliability in real-world conditions we need more sophisticated techniques, including facial feature detection (for example, detecting eyes, nose, mouth and eyebrows). For simplicity, this chapter will just use eye detection and ignore other facial features such as the mouth and nose, which are less useful.

Eye detectors that detect open or closed eyes are as follows:
• haarcascade_mcs_lefteye.xml (and haarcascade_mcs_righteye.xml)
• haarcascade_lefteye_2splits.xml (and haarcascade_righteye_2splits.xml)

Eye detectors that detect open eyes only are as follows:
• haarcascade_eye.xml
• haarcascade_eye_tree_eyeglasses.xml

Eye search region (as a fraction of the face size):

Cascade Classifier               EYE_SX  EYE_SY  EYE_SW  EYE_SH
haarcascade_eye.xml              0.16    0.26    0.30    0.28
haarcascade_mcs_lefteye.xml      0.10    0.19    0.40    0.36
haarcascade_lefteye_2splits.xml  0.12    0.17    0.37    0.36

Reliability and detection speed after LBP face detection (i7 2.2 GHz):

Cascade Classifier                   Reliability*  Speed**  Eyes found      Glasses
haarcascade_mcs_lefteye.xml          80%           18 msec  Open or closed  no
haarcascade_lefteye_2splits.xml      60%           7 msec   Open or closed  no
haarcascade_eye.xml                  40%           5 msec   Open only       no
haarcascade_eye_tree_eyeglasses.xml  15%           10 msec  Open only       yes

Preprocessing:

Geometrical transformation and cropping: This process would include scaling, rotating, and translating the images so that the eyes are aligned, followed by the removal of the forehead, chin, ears, and background from the face image.
• Rotate the face so that the two eyes are horizontal.
• Scale the face so that the distance between the two eyes is always the same.
• Translate the face so that the eyes are always centered horizontally and at a desired height.
• Crop the outer parts of the face, since we want to crop away the image background, hair, forehead, ears, and chin.
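The rotate and scale steps listed above can be sketched numerically. Assuming the two eye centers are already known (for example, from the eye detectors above), the rotation angle and scale factor for alignment could be computed as follows; alignment_params and desired_dist are hypothetical names for this sketch, and a real implementation would feed the results to cv2.getRotationMatrix2D and cv2.warpAffine:

```python
import math

def alignment_params(left_eye, right_eye, desired_dist=48.0):
    """Compute the rotation angle (degrees) and scale factor that make
    the two eyes horizontal and a fixed distance apart.

    left_eye / right_eye are (x, y) pixel coordinates; desired_dist is
    the target distance between the eyes in the aligned image.
    """
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    dist = math.hypot(dx, dy)                  # current eye distance
    angle = math.degrees(math.atan2(dy, dx))   # rotate by -angle to level the eyes
    scale = desired_dist / dist                # scale to normalize eye distance
    return angle, scale

angle, scale = alignment_params((100, 100), (180, 100))
print(angle, scale)  # → 0.0 0.6  (already level, 80 px apart, shrink to 48 px)
```

Translation and cropping then reduce to choosing where the midpoint between the eyes should land in the output image before cutting the face rectangle.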
Separate histogram equalization for left and right sides: This process standardizes the brightness and contrast on both the left- and right-hand sides of the face independently.

Smoothing: This process reduces the image noise using a bilateral filter.

Elliptical mask: The elliptical mask removes some remaining hair and background from the face image.

(3) Collecting face images and learning from them

Collecting faces can be just as simple as putting each newly preprocessed face from the camera into an array of preprocessed faces, as well as putting a label into an array (to specify which person the face was taken from). The face recognition algorithm will then learn how to distinguish between the faces of the different people. This is referred to as the training phase, and the collected faces are referred to as the training set.

It is important that you provide a good training set that covers the types of variations you expect to occur in your testing set. For example, if you will only test with faces that are looking perfectly straight ahead (such as ID photos), then you only need to provide training images with faces that are looking perfectly straight ahead. But if the person might be looking to the left or up, then you should make sure the training set also includes faces of that person doing this; otherwise the face recognition algorithm will have trouble recognizing them, as their face will appear quite different. One way to obtain a good training set that covers many different real-world conditions is for each person to rotate their head from looking left, to up, to right, to down, then looking directly straight.

After you have collected enough faces for each person to recognize, you must train the system to learn the data using a machine-learning algorithm suited for face recognition. There are many different face recognition algorithms in the literature, the simplest of which are Eigenfaces and Artificial Neural Networks.
Eigenfaces tends to work better than ANNs, and despite its simplicity, it tends to work almost as well as many more complex face recognition algorithms, so it has become very popular as the basic face recognition algorithm for beginners, as well as a baseline for new algorithms to be compared against.

Any reader who wishes to work further on face recognition is recommended to read the theory behind:
• Eigenfaces (also referred to as Principal Component Analysis (PCA))
• Fisherfaces (also referred to as Linear Discriminant Analysis (LDA))
• Other classic face recognition algorithms (many are available at http://www.face-rec.org/algorithms/)
• Newer face recognition algorithms in recent Computer Vision research papers (such as CVPR and ICCV at http://www.cvpapers.com/), as there are hundreds of face recognition papers published each year

Thanks to the OpenCV team and Philipp Wagner's libfacerec contribution, OpenCV v2.4.1 provides cv::Algorithm as a simple and generic method to perform face recognition using one of several different algorithms (even selectable at runtime) without necessarily understanding how they are implemented. Here are the three face recognition algorithms available in OpenCV v2.4.1:
• FaceRecognizer.Eigenfaces: Eigenfaces, also referred to as PCA, first used by Turk and Pentland in 1991.
• FaceRecognizer.Fisherfaces: Fisherfaces, also referred to as LDA, invented by Belhumeur, Hespanha and Kriegman in 1997.
• FaceRecognizer.LBPH: Local Binary Pattern Histograms, invented by Ahonen, Hadid and Pietikäinen in 2004.

These face recognition algorithms are available through the FaceRecognizer class in OpenCV. Both the Eigenfaces and Fisherfaces algorithms first calculate the average face, that is, the mathematical average of all the training images, so that they can subtract the average image from each facial image for better face recognition results.
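The "average face" computation described above can be illustrated with a tiny pure-Python sketch (no OpenCV; average_face and subtract_mean are hypothetical names for this illustration, and real code would operate on numpy arrays of pixel values):

```python
def average_face(faces):
    """Pixel-wise mean of a list of equally sized face images,
    each given as a flat list of grayscale values."""
    n = len(faces)
    return [sum(pix) / n for pix in zip(*faces)]

def subtract_mean(face, mean):
    """Subtract the average face from one image, as Eigenfaces and
    Fisherfaces do before computing eigenvectors."""
    return [p - m for p, m in zip(face, mean)]

faces = [[10, 20, 30], [30, 40, 50]]  # two tiny 1x3 "images"
mean = average_face(faces)
print(mean)                           # → [20.0, 30.0, 40.0]
print(subtract_mean(faces[0], mean))  # → [-10.0, -10.0, -10.0]
```

The mean-subtracted vectors are what the training step actually decomposes; PCA then finds the directions along which these differences vary the most.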
(4) Face recognition

Thanks to OpenCV's FaceRecognizer class, we can identify the person in a photo simply by calling the FaceRecognizer::predict() function on a facial image. The problem with this identification is that it will always predict one of the given people, even if the input photo is of an unknown person or of a car. It would still tell you which person is the most likely person in that photo, so it can be difficult to trust the result! The solution is to obtain a confidence metric so we can judge how reliable the result is, and if it seems that the confidence is too low, we assume it is an unknown person.

To confirm whether the result of the prediction is reliable or whether it should be taken as an unknown person, we perform face verification (also referred to as face authentication) to obtain a confidence metric showing whether the single face image is similar to the claimed person (as opposed to face identification, which we just performed, comparing the single face image with many people). OpenCV's FaceRecognizer class can return a confidence metric when you call the predict() function, but unfortunately this confidence metric is simply based on the distance in eigen-subspace, so it is not very reliable.

One way to overcome this: reconstruct the facial image from the eigenspace, adjust the brightness and so on, and then compare the reconstruction with the input. We can now calculate how similar this reconstructed face is to the input face by using the same getSimilarity() function we created previously for comparing two images, where a value less than 0.3 implies that the two images are very similar. For Eigenfaces, there is one eigenvector for each face, so reconstruction tends to work well, and therefore we can typically use a threshold of 0.5; but Fisherfaces has just one eigenvector for each person, so reconstruction will not work as well, and therefore it needs a higher threshold, say 0.7.
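A pure-Python sketch of this verification idea follows. The names get_similarity and is_same_person are hypothetical; the book's getSimilarity divides the L2 norm of the pixel difference by the image area, and the 0.3 / 0.5 / 0.7 thresholds quoted above depend on image size and on how pixel values are scaled:

```python
import math

def get_similarity(a, b):
    """Relative L2 error between two equally sized images given as flat
    lists of pixel values in [0.0, 1.0]; smaller means more similar."""
    err = math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return err / len(a)  # normalize by the number of pixels

def is_same_person(face, reconstructed, threshold=0.3):
    """Face verification: accept the claimed identity only when the
    eigenspace reconstruction is close enough to the input face."""
    return get_similarity(face, reconstructed) < threshold

face = [0.0, 0.0, 1.0, 1.0]                        # a tiny 2x2 "image"
print(is_same_person(face, [0.0, 0.0, 1.0, 1.0]))  # → True  (perfect reconstruction)
print(is_same_person(face, [1.0, 1.0, 0.0, 0.0]))  # → False (reconstruction failed)
```

The point is the rejection path: an unknown person reconstructs poorly from the learned eigenspace, so the similarity test fails even though predict() would still name somebody.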