C++ and OpenCV: InputArrayOfArrays Type and reconstruct() function in SfM Module
I am experimenting with the SfM module in opencv_contrib, and I was reading the documentation for the reconstruct() function. I'm a bit confused about one of the parameters I can pass to it.
One of the arguments is:
InputArrayOfArrays points2d
and that's defined as:
points2d Input vector of vectors of 2d points (the inner vector is per image).
Does that mean we flatten the Mat data type of each image into a vector of vectors or something? I don't get it.
Do I need to use a feature matcher and find matching features, then put those features in this vector?
Either I'm not understanding what it means properly, or the documentation is a bit confusing. I'm rather new to OpenCV in C++.
Thanks in advance!
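From the parameter description, the answer to both questions appears to be yes: you first run feature detection and matching (or tracking) across all images, and points2d holds those tracked correspondences, one inner vector per image, where the same index in every inner vector refers to the same physical scene point. In C++ this is typically a std::vector<std::vector<cv::Point2f>> or one 2xN cv::Mat per image. Here is a minimal sketch of that layout with made-up coordinates (shown as plain Python lists only to illustrate the structure):

```python
# Hypothetical tracked correspondences for 3 images and 4 tracked points.
# The outer list is "per image"; index i in every inner list is the SAME
# physical scene point, so the ordering itself encodes the matches that a
# feature matcher/tracker must have established beforehand.
tracks = [
    [(120.0, 45.0), (300.5, 80.2), (56.1, 200.9), (410.0, 310.3)],  # image 0
    [(118.2, 47.1), (298.9, 83.0), (54.0, 203.5), (407.8, 313.0)],  # image 1
    [(116.5, 49.0), (297.0, 85.7), (52.2, 206.0), (405.1, 315.8)],  # image 2
]

n_images = len(tracks)      # outer vector: one entry per image
n_points = len(tracks[0])   # inner vector: one 2D point per tracked feature
```

Every inner list must have the same length, because a reconstruction needs each track observed (at the same index) in each image it covers.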
See also questions close to this topic

WinAPI: Loading Resources from a DLL
I'm developing an application for Windows 7 with Visual Studio 2017. This application needs special cursors, which are loaded from a DLL. So first I created a DLL and added the following .rc file:
BM_CURSOR_GRAB     CURSOR "./grab.cur"
BM_CURSOR_GRABBING CURSOR "./grabbing.cur"
BM_CURSOR_GRAB and BM_CURSOR_GRABBING are defined in a header file as:
#define BM_CURSOR_GRAB     100
#define BM_CURSOR_GRABBING 101
I compile the DLL (that works) and check it with ResourceEditor.exe. My resources are included (see the screenshot from the Resource Editor). Now the "non-working" part starts. My application wants to load the cursor, but
FindResource
doesn't find it. Here is my code:

HMODULE dll = LoadLibrary("BenjaMiniRessources.dll");
HRSRC hRes = FindResource(dll, MAKEINTRESOURCE(100), RT_CURSOR);
DWORD dwSize = SizeofResource(dll, hRes);
HGLOBAL hMem = LoadResource(dll, hRes);
LPBYTE pBytes = (LPBYTE)LockResource(hMem);
Cursor = CreateIconFromResource(pBytes, dwSize, false, 0x00030000);
What am I doing wrong?

C++ out-of-range error when using vector
I'm using Heun's method to solve the differential equation f'(x) = x with f(0) = 1 for t, but the code does not work at all. Here is the code:
#include <iostream>
#include <vector>
#include <fstream>
using namespace std;

double func(double x) { return x; }

int main(void) {
    double x0 = 1;
    int T = 10;
    int N = 100;
    double dt = static_cast<double>(T) / N;  // int/int division would give dt == 0
    // N + 1 rows, two columns each: time and solution value
    vector<vector<double> > sol1(N + 1, vector<double>(2, 0));
    sol1[0][0] = 0;
    sol1[0][1] = x0;
    double x = x0;
    double k1, k2;
    for (int j = 1; j <= N; j++) {
        sol1[j][0] = j * dt;
        k1 = func(x);
        k2 = func(x + dt * k1);
        x += dt * (k1 + k2) / 2;
        sol1[j][1] = x;
    }
    ofstream outFile("output.txt");
    for (int j = 0; j <= N; j++) {
        for (int i = 0; i < 2; i++) {  // inner loop was incrementing j, running past the end
            outFile << sol1[j][i] << " ";
        }
        outFile << endl;
    }
    outFile.close();
    return 0;
}
The result says "vector subscript is out of range". As there is no compile error message in VS2017, I'm confused about how to resolve it. Is there a mistake in the code?
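For reference, here is a minimal Heun (improved Euler) sketch in Python for the same problem, x' = x with x(0) = 1, whose exact solution is e^t; it mirrors what the C++ code above is attempting:

```python
import math

def heun(f, x0, T, N):
    """Heun's (improved Euler) method for x' = f(x), x(0) = x0, on [0, T]."""
    dt = T / N                   # true division - the C++ int/int pitfall does not arise here
    xs = [x0]
    x = x0
    for _ in range(N):
        k1 = f(x)                # slope at the start of the step
        k2 = f(x + dt * k1)      # slope at the Euler-predicted end point
        x += dt * (k1 + k2) / 2  # trapezoidal average of the two slopes
        xs.append(x)
    return xs

# x' = x with x(0) = 1 has the exact solution e^t
xs = heun(lambda x: x, 1.0, 10, 1000)
err = abs(xs[-1] - math.exp(10)) / math.exp(10)  # relative error at t = 10
```

With 1000 steps the relative error at t = 10 stays well below 1%, which is the expected second-order behaviour of Heun's method.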

Lambda expression for operator overloading
Is it possible to write lambda expression for overloading operators?
For example, I have the following structure:
struct X {
    int value; //(I can't modify this structure)
};
X needs an == operator:

int main() {
    X a = { 123 };
    X b = { 123 };
    //[define equality operator for X inside main function]
    //if(a == b) {}
    return 0;
}

The == operator can be defined as

bool operator==(const X& lhs, const X& rhs) { ... }

but this requires adding a separate function, and my comparison is valid only within a specific function. A lambda such as

auto compare = [](const X& lhs, const X& rhs) { ... };

would solve the problem, but I was wondering if I can write this lambda as an operator.
Multiple line detection in HoughLinesP openCV function
I am new to Python and OpenCV. I am trying to detect a single line with the HoughLinesP function using code from the internet, but 34 lines are detected. I tried adjusting the maxLineGap parameter, but it did not help.
Input image: https://imgur.com/a/oPYde6r Output Image: https://imgur.com/a/AGt9A6j
import sys
import math
import cv2 as cv
import numpy as np

def main(argv):
    default_file = "line.png"
    filename = argv[0] if len(argv) > 0 else default_file
    # Loads an image
    src = cv.imread(filename, cv.IMREAD_GRAYSCALE)
    # Check if image is loaded fine
    if src is None:
        print('Error opening image!')
        print('Usage: hough_lines.py [image_name -- default ' + default_file + '] \n')
        return 1
    dst = cv.Canny(src, 50, 200, None, 3)
    # Copy edges to the images that will display the results in BGR
    cdst = cv.cvtColor(dst, cv.COLOR_GRAY2BGR)
    cdstP = np.copy(cdst)
    lines = cv.HoughLines(dst, 1, np.pi / 180, 150, None, 0, 0)
    if lines is not None:
        for i in range(0, len(lines)):
            rho = lines[i][0][0]
            theta = lines[i][0][1]
            a = math.cos(theta)
            b = math.sin(theta)
            x0 = a * rho
            y0 = b * rho
            pt1 = (int(x0 + 1000*(-b)), int(y0 + 1000*(a)))
            pt2 = (int(x0 - 1000*(-b)), int(y0 - 1000*(a)))
            cv.line(cdst, pt1, pt2, (0,0,255), 3, cv.LINE_AA)
    linesP = cv.HoughLinesP(dst, 1, np.pi / 180, 50, None, 50, 150)
    no_of_Lines = 0
    if linesP is not None:
        for i in range(0, len(linesP)):
            l = linesP[i][0]
            no_of_Lines = no_of_Lines + 1
            cv.line(cdstP, (l[0], l[1]), (l[2], l[3]), (0,0,255), 3, cv.LINE_AA)
    print('Number of lines: ' + str(no_of_Lines))
    cv.imshow("Source", src)
    cv.imshow("Detected Lines (in red) - Standard Hough Line Transform", cdst)
    cv.imshow("Detected Lines (in red) - Probabilistic Line Transform", cdstP)
    cv.waitKey()
    return 0

if __name__ == "__main__":
    main(sys.argv[1:])
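HoughLinesP typically returns many short, overlapping segments along one physical line. A common post-processing step is to group segments with similar angle and line offset and span each group with a single segment. This is my own rough sketch, not an OpenCV API, and the tolerances are arbitrary and image-dependent:

```python
import math

def merge_collinear(segments, angle_tol_deg=5.0, rho_tol=20.0):
    """Greedy grouping of HoughLinesP-style segments (x1, y1, x2, y2).
    Segments with similar direction and similar normal-form offset rho are
    grouped, then each group is spanned by one segment between its extreme
    endpoints. Angle wrap-around near vertical lines is ignored for brevity."""
    groups = []
    for x1, y1, x2, y2 in segments:
        ang = math.atan2(y2 - y1, x2 - x1) % math.pi    # undirected angle in [0, pi)
        rho = x1 * math.sin(ang) - y1 * math.cos(ang)   # signed offset of the line from the origin
        for g in groups:
            dang = abs(ang - g["ang"])
            dang = min(dang, math.pi - dang)
            if math.degrees(dang) < angle_tol_deg and abs(rho - g["rho"]) < rho_tol:
                g["pts"] += [(x1, y1), (x2, y2)]
                break
        else:
            groups.append({"ang": ang, "rho": rho, "pts": [(x1, y1), (x2, y2)]})
    merged = []
    for g in groups:
        d = (math.cos(g["ang"]), math.sin(g["ang"]))    # unit direction of the group
        pts = sorted(g["pts"], key=lambda p: p[0] * d[0] + p[1] * d[1])
        merged.append(pts[0] + pts[-1])                 # extreme endpoints span the group
    return merged

# two overlapping horizontal pieces plus one separate line -> two merged lines
merged = merge_collinear([(0, 0, 50, 0), (40, 0, 100, 0), (0, 50, 100, 50)])
```

Feeding linesP segments through something like this (and counting the groups) usually gives a line count much closer to what the eye sees.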

Python OpenCV Error: cv2.cvtColor(image, cv2.COLOR_BGR2GRAY) (215) scn == 3 || scn == 4 in function cv::cvtColor
I am getting an error in my Python script. It won't look in the second folder no matter what I do. Here is my code:
import cv2
import glob
import random
import math
import numpy as np
import dlib
import itertools
from sklearn.svm import SVC
import pickle

emotions = ["anger", "disgust", "fear", "joy", "neutral", "sadness", "surprise"] #Emotion list
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8,8))
detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat") #Or set this to whatever you named the downloaded file
clf = SVC(kernel='linear', probability=True, tol=1e-3)#, verbose = True) #Set the classifier as a support vector machine with a linear kernel
data = {} #Make dictionary for all values

def get_files(emotion): #Define function to get file list, randomly shuffle it and split 80/20
    files = glob.glob("dset1\%s\*" %emotion) #change dataset directory address here!
    random.shuffle(files)
    training = files[:int(len(files)*0.8)] #get first 80% of file list
    prediction = files[-int(len(files)*0.2):] #get last 20% of file list
    return training, prediction

def get_landmarks(image):
    detections = detector(image, 1)
    for k,d in enumerate(detections): #For all detected face instances individually
        shape = predictor(image, d) #Draw Facial Landmarks with the predictor class
        xlist = []
        ylist = []
        for i in range(1,68): #Store X and Y coordinates in two lists
            xlist.append(float(shape.part(i).x))
            ylist.append(float(shape.part(i).y))
        #record mean values of both X Y coordinates
        xmean = np.mean(xlist)
        ymean = np.mean(ylist)
        #store central deviance
        xcentral = [(x-xmean) for x in xlist]
        ycentral = [(y-ymean) for y in ylist]
        landmarks_vectorised = []
        for x, y, w, z in zip(xcentral, ycentral, xlist, ylist): #analysing presence of facial landmarks
            landmarks_vectorised.append(w)
            landmarks_vectorised.append(z)
            #extract center of gravity with mean of axis
            meannp = np.asarray((ymean,xmean))
            coornp = np.asarray((z,w))
            #measuring distance and angle of each landmark from center of gravity
            dist = np.linalg.norm(coornp-meannp)
            landmarks_vectorised.append(dist)
            landmarks_vectorised.append((math.atan2(y, x)*360)/(2*math.pi))
        data['landmarks_vectorised'] = landmarks_vectorised #store landmarks in global dictionary
    if len(detections) < 1: #if no landmarks were detected, store error in dictionary
        data['landmarks_vectorised'] = "error"

def make_sets():
    training_data = []
    training_labels = []
    prediction_data = []
    prediction_labels = []
    for emotion in emotions: #train for each emotion
        print(" working on %s" %emotion)
        training, prediction = get_files(emotion) #obtain the dataset
        #Append data to training and prediction list, and generate labels 0-6
        for item in training:
            image = cv2.imread(item) #open image
            gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY) #convert to grayscale
            clahe_image = clahe.apply(gray) #apply local histogram equalization
            get_landmarks(clahe_image) #extract landmarks
            if data['landmarks_vectorised'] == "error":
                print("no face detected on this one")
            else:
                training_data.append(data['landmarks_vectorised']) #append image array to training data list
                training_labels.append(emotions.index(emotion))
        #do the same for the test dataset as above
        for item in prediction:
            image = cv2.imread(item)
            gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
            clahe_image = clahe.apply(gray)
            get_landmarks(clahe_image)
            if data['landmarks_vectorised'] == "error":
                print("no face detected on this one")
            else:
                prediction_data.append(data['landmarks_vectorised'])
                prediction_labels.append(emotions.index(emotion))
    return training_data, training_labels, prediction_data, prediction_labels

accur_lin = []
for i in range(0,7): #set number of training iterations here
    print("Making sets %s" %i) #Make sets by random sampling 80/20%
    training_data, training_labels, prediction_data, prediction_labels = make_sets()
    npar_train = np.array(training_data) #Turn the training set into a numpy array for the classifier
    print(training_labels)
    npar_trainlabs = np.array(training_labels)
    print("training SVM linear %s" %i) #train SVM
    clf.fit(npar_train, training_labels)
    print("getting accuracies %s" %i) #Use score() function to get accuracy
    npar_pred = np.array(prediction_data)
    pred_lin = clf.score(npar_pred, prediction_labels)
    print("linear: ", pred_lin)
    accur_lin.append(pred_lin) #Store accuracy in a list

# Create a variable to pickle and open it in write mode
pkl_filename = "pickle_model.pkl"
with open(pkl_filename, 'wb') as file:
    pickle.dump(clf, file) #writes model to a pickle file
print("Mean value lin svm: %s" %np.mean(accur_lin)) #Get mean accuracy of all runs
Interestingly enough, if I change the order of the array so it tries to look for 'disgust' first, it finds and processes it fine. Twice, when I repeatedly ran the script (without changing anything), it found the second folder but failed on the third.
Here is a screenshot of the error...
Normal Order
Here is a screenshot of the array where it should process anger twice
emotions = ["anger", "anger", "fear", "joy", "neutral", "sadness", "surprise"] #Emotion list
Anger twice, still won't detect.
It seems that it's something to do with Python rather than the code. Any ideas?
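For what it's worth, this cvtColor assertion ((215) scn == 3 || scn == 4) means the array reaching cvtColor does not have 3 or 4 channels, and the usual culprit is that cv2.imread silently returns None for a path it cannot read (backslashes in glob patterns like "dset1\%s\*" are a classic cause; os.path.join or forward slashes are safer). A small dependency-free guard you could drop in before each cvtColor call (the function name is my own):

```python
def check_loaded(img, path):
    """Fail fast with a clear message instead of letting cv2.cvtColor assert later.
    cv2.imread raises nothing on a bad path - it just returns None."""
    if img is None:
        raise FileNotFoundError("cv2.imread could not read: %r" % path)
    if getattr(img, "ndim", 0) != 3:
        raise ValueError("expected a 3-channel BGR image for %r, got shape %s"
                         % (path, getattr(img, "shape", None)))
    return img

# quick demo with a stand-in "image" (any array-like with ndim == 3 passes)
class _Fake:
    ndim = 3
    shape = (4, 4, 3)

ok = check_loaded(_Fake(), "some/file.png") is not None

try:
    check_loaded(None, "dset1/anger/missing.png")
    raised = False
except FileNotFoundError:
    raised = True
```

If the guard fires on a path that looks right, print the path it received; intermittent failures like the ones described often come from the shuffled file list occasionally including an unreadable or non-image file.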
How to remove rectangle shapes from an image, keeping text, in Python 3?

How to calculate camera movement speed between two frames?
I want to know how much my camera moved (expressed in speed, e.g. m/s) between two images given timestamp for each image.
I know that I can get the pose of my camera while taking the second image relative to the pose of my camera while taking the first image.
My motivation comes from the MathWorks Structure from Motion documentation.
But now I am stuck.
- How do I calculate the speed if I have the camera position and orientation at t0 and at t1? How do I combine linear and angular speed, i.e. translation and rotation?
- Also, I guess I am lacking a measurement unit (like meters) relative to the real world. I can probably only calculate the speed relative to the camera positions, not the real world. I would need an object of known size, such as a marker. Then my speed estimate would only work as long as I have an object of known size in both images, right?
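To the first question: with camera centres C0, C1 and timestamps t0, t1, the linear speed is ||C1 - C0|| / (t1 - t0), and the angular speed follows from the angle of the relative rotation R1·R0ᵀ; the two are usually reported separately rather than mixed into one number. A dependency-free sketch, assuming row-major 3x3 rotation matrices and camera centres already expressed in a common frame:

```python
import math

def camera_speeds(C0, R0, C1, R1, t0, t1):
    """Linear and angular speed between two timed camera poses.
    C0, C1: camera centres (3-vectors, in the reconstruction's units);
    R0, R1: 3x3 row-major rotation matrices; t0, t1: timestamps in seconds."""
    dt = t1 - t0
    # linear speed: distance travelled by the camera centre per unit time
    v = math.sqrt(sum((a - b) ** 2 for a, b in zip(C1, C0))) / dt
    # relative rotation R_rel = R1 * R0^T; its rotation angle follows from the trace
    R_rel = [[sum(R1[i][k] * R0[j][k] for k in range(3)) for j in range(3)]
             for i in range(3)]
    tr = R_rel[0][0] + R_rel[1][1] + R_rel[2][2]
    angle = math.acos(max(-1.0, min(1.0, (tr - 1.0) / 2.0)))
    return v, angle / dt  # (units per second, radians per second)

# example: 5 units of travel and a 90-degree turn about z in one second
I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
Rz90 = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]
v, w = camera_speeds([0, 0, 0], I3, [3, 4, 0], Rz90, 0.0, 1.0)
```

And yes, as suspected in the second question: the linear speed is only metric (m/s) if the reconstruction's scale is anchored to something of known real-world size; otherwise it is speed in arbitrary reconstruction units.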

Comparison of Estimated Camera Pose by SFM with Ground Truth data
I am studying the effect of smoothing an image before passing it to an SfM pipeline. The SfM pipeline provides me with the camera intrinsic and extrinsic matrices, and as ground truth I have the intrinsic and extrinsic matrices of each camera.
Further, I estimate the camera position in world coordinates as C = -(Transpose(R))*t, where R is the 3x3 rotation matrix and t is the 3x1 translation vector.
Now, I am confused about how to find the error of the estimated position compared to the ground truth, e.g. the Euclidean distance between the estimated and ground-truth camera positions.
I am confused because the estimated camera pose can be in a different coordinate system, which would give me an inaccurate error.
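Assuming OpenCV-style extrinsics that map world to camera (x_cam = R·X + t), the camera centre is C = -Rᵀ·t, and once both poses are expressed in the same world frame the positional error is just the Euclidean distance between centres. The coordinate-system worry is real: SfM output is generally only defined up to a similarity transform, so it must first be aligned to the ground-truth frame (e.g. with a Umeyama-style alignment) before the distance means anything. A dependency-free sketch of the final comparison step:

```python
import math

def camera_center(R, t):
    """World-space camera centre C = -R^T t for extrinsics mapping world to camera
    (x_cam = R X + t); R is 3x3 row-major, t a 3-vector."""
    return [-sum(R[j][i] * t[j] for j in range(3)) for i in range(3)]

def position_error(R_est, t_est, R_gt, t_gt):
    """Euclidean distance between estimated and ground-truth camera centres.
    Only meaningful once both poses live in the same (aligned, same-scale) frame."""
    C_e = camera_center(R_est, t_est)
    C_g = camera_center(R_gt, t_gt)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(C_e, C_g)))

# toy check: identical rotations, translations differing by 1 along z
I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
err = position_error(I3, [1, 2, 3], I3, [1, 2, 2])
```

With more than a handful of cameras, the per-camera errors are typically summarised as an RMSE over all centres after the alignment.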

2D Image to 3D world Coordinates
We crawled a set of images from the Google Street View (GSV) API. I want to estimate the 3D world coordinates from a 2D image given the following:
1. The GPS location (i.e., latitude and longitude) of the camera capturing the image
Conversion of GPS coordinates to a translation matrix: I used two conversion methods to get the translation matrix, UTM conversion and conversion to Cartesian coordinates.
- UTM conversion: used Python's utm library to convert GPS coordinates to UTM coordinates, then used the north and east values with a fixed height to create the translation matrix.
- Cartesian conversion: used the following formulas to generate the translation matrix:
x = Radius*math.cos(latitude)*math.cos(longitude)
y = Radius*math.cos(latitude)*math.sin(longitude)
z = Radius*math.sin(latitude)
2. The rotation matrix calculated using openSFM (i.e., the SFM algorithm).
The library provides alpha, beta, gamma angles (in radians), which map to the yaw, pitch, and roll angles, respectively. The rotation matrix is constructed using the formula from http://planning.cs.uiuc.edu/node102.html:
Rotation matrix (R): R(alpha, beta, gamma) = R_z(alpha) * R_y(beta) * R_x(gamma)
3. Based on the field-of-view angle and the dimensions of the image, we estimate the calibration matrix as follows (https://codeyarns.com/2015/09/08/how-to-compute-intrinsic-camera-matrix-for-a-camera/):
K = [[f_x,   s, x],
     [  0, f_y, y],
     [  0,   0, 1]]

where s is the skew, and x and y are half of the image dimensions (i.e., x = width/2 and y = height/2).
The GSV API provides the field-of-view angle θ in degrees (e.g., 45 or 80), so the focal lengths can be calculated as
f_x= x/tan(θ/2)
f_y= y/tan(θ/2)
Using the matrices T, R, and K, how can we estimate the 3D World coordinates of each pixel in the 2D image?
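One caveat worth stating: T, R, and K alone determine only a viewing ray per pixel, not a 3D point. Each pixel's world point is C + λ·d for an unknown depth λ > 0, where C is the camera centre and d the ray direction; recovering λ needs extra information (a second view, a depth map, or an assumption such as a known ground plane). A sketch of the ray computation, assuming the world-to-camera convention x_cam = R·X + t and K as above:

```python
import math

def pixel_ray(u, v, K, R):
    """Unit direction (world frame) of the viewing ray through pixel (u, v),
    for K = [[f_x, s, x], [0, f_y, y], [0, 0, 1]] and a world-to-camera R."""
    fx, s, cx = K[0]
    fy, cy = K[1][1], K[1][2]
    # invert the upper-triangular K analytically: d_cam ~ K^-1 [u, v, 1]^T
    yc = (v - cy) / fy
    xc = (u - cx - s * yc) / fx
    d_cam = [xc, yc, 1.0]
    # rotate into the world frame (R^T = R^-1 for rotations) and normalise
    d = [sum(R[j][i] * d_cam[j] for j in range(3)) for i in range(3)]
    n = math.sqrt(sum(c * c for c in d))
    return [c / n for c in d]

# sanity check: the principal point looks straight down the optical axis
K = [[100.0, 0.0, 50.0], [0.0, 100.0, 50.0], [0.0, 0.0, 1.0]]
I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
d = pixel_ray(50.0, 50.0, K, I3)
```

Intersecting rays from two overlapping GSV panoramas (triangulation), or intersecting one ray with an assumed ground plane at camera height, are the two usual ways to fix λ and obtain an actual 3D world coordinate.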