# Thoughts

The autocorrelation function indicates how much the samples depend on one another. When the function is 0 for a large number of lags, it implies that most of the samples are uncorrelated with each other, which also means that the process is highly random. If the values of the autocorrelation function are quite high, then the random process is not very random, as samples depend on other samples quite a lot.

Now let us consider a special case, $r(k) = N\delta(k)$. Here it is non-zero only for $k = 0$, indicating that all the samples are uncorrelated with the other samples, so this process is highly random. But the expected power of each sample is non-zero: it is $r(0) = N$. When the process is highly random, the values of the samples change drastically on average. So when any arbitrary realisation of this process is picked, we would observe a highly fluctuating graph, which indicates high-frequency content in the process. So when we find the power spectral density, all the frequencies have non-zero values: the PSD would be $N$ at every frequency.

---

For describing the LTI system output, $y(n) = \sum_k h(k)\,x(n-k)$ is preferred over $y(n) = \sum_k h(n-k)\,x(k)$. Both mean the same thing, but intuitively they read differently:
1. The first means that we are adding a linear combination of delayed input signals to get our output signal.
2. The second means that we are adding a linear combination of input values to get the output signal.
(Both sums are checked against each other in a numpy sketch below.)

---

<span style="color:red">Doubt: what is the difference between $r_{xy}(k)$ and $r_{yx}(k)$ intuitively?</span>

---

When we pass a random signal through an LTI system we obviously get a random signal as the output. But the properties of the output signal are related to the properties of the input signal. The properties that we would like to observe are:
1. 1st moment
2. 2nd moment
3. power spectral density

It can be proven mathematically that if the input signal is 2nd-order stationary then the output signal is also 2nd-order stationary, and the output PSD is $S_y(\omega) = |H(\omega)|^2 S_x(\omega)$. Now we can treat this system as a low-pass or high-pass filter and filter out frequencies from the input. But remember that this is only the average power. So the power of individual realisations at a particular frequency may be higher or lower than the PSD at that frequency, but when we average the power at that frequency over all the realisations, it matches the PSD value at that frequency exactly (see the averaged-periodogram sketch below). <span style="color:red">If my system is a band-pass filter then the average power at all the frequencies outside this band becomes 0, so does this mean that no realisation would have those frequency components in it?</span>

---

A non-trivial proof of why the PSD is always non-negative for real-valued signals. First of all, for real-valued signals the PSD is symmetric about the y-axis (try proving it later). We will construct an LTI system such that the frequency response of the system is two impulses, at $\omega_0$ and $-\omega_0$. The PSD of the output is $|H(\omega)|^2$ times the PSD of the input. Also, the PSD of the output integrated from $-\pi$ to $\pi$ is $r_{yy}(0)$, which we know is always non-negative. So upon integration we get $2\,S_x(\omega_0) \geq 0$. Hence the PSD of the signal is always non-negative. Even though there were a lot of assumptions involved in this proof, nowhere did I assume any knowledge about $S_x(\omega_0)$ except that the PSD is symmetric.
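The numpy sketch referenced above for the two convolution forms: a minimal check, with made-up `h` and `x`, that both sums produce the same output and agree with `np.convolve`.

```python
import numpy as np

def conv_delayed_inputs(h, x):
    # Form 1: y[n] = sum_k h[k] x[n-k], weighted delayed copies of the input.
    y = np.zeros(len(h) + len(x) - 1)
    for n in range(len(y)):
        for k in range(len(h)):
            if 0 <= n - k < len(x):
                y[n] += h[k] * x[n - k]
    return y

def conv_shifted_responses(h, x):
    # Form 2: y[n] = sum_k h[n-k] x[k], input values weighting shifted responses.
    y = np.zeros(len(h) + len(x) - 1)
    for n in range(len(y)):
        for k in range(len(x)):
            if 0 <= n - k < len(h):
                y[n] += h[n - k] * x[k]
    return y

h = np.array([1.0, 0.5, 0.25])   # hypothetical impulse response
x = np.array([2.0, -1.0, 3.0])   # hypothetical input
print(np.allclose(conv_delayed_inputs(h, x), conv_shifted_responses(h, x)))  # True
print(np.allclose(conv_delayed_inputs(h, x), np.convolve(h, x)))             # True
```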
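And the averaged-periodogram sketch for $S_y(\omega) = |H(\omega)|^2 S_x(\omega)$: a hypothetical 3-tap FIR filter driven by unit-variance white noise (so $S_x(\omega) = 1$), with the periodogram averaged over many realisations. Any single realisation's periodogram fluctuates wildly; only the average settles onto $|H(\omega)|^2$.

```python
import numpy as np

rng = np.random.default_rng(0)
n, n_real = 1024, 500
h = np.array([0.25, 0.5, 0.25])          # hypothetical low-pass FIR filter

# Average the periodogram of the output over many white-noise realisations.
psd_est = np.zeros(n)
for _ in range(n_real):
    x = rng.standard_normal(n)           # unit-variance white noise: S_x(w) = 1
    y = np.convolve(x, h, mode="same")
    psd_est += np.abs(np.fft.fft(y)) ** 2 / n
psd_est /= n_real

# Theory: S_y(w) = |H(w)|^2 * S_x(w) = |H(w)|^2.
H = np.fft.fft(h, n)
print(np.max(np.abs(psd_est - np.abs(H) ** 2)))   # small; shrinks as n_real grows
```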
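The integration step of the non-negativity proof, written out under the informal assumption that the squared magnitude response can be taken as a pair of spectral impulses (this is exactly the hand-wavy part; normalisation constants depend on the PSD convention used):

$$
\begin{aligned}
|H(\omega)|^2 &= \delta(\omega - \omega_0) + \delta(\omega + \omega_0) \\
0 \le r_{yy}(0) &= \int_{-\pi}^{\pi} S_y(\omega)\,d\omega
  = \int_{-\pi}^{\pi} S_x(\omega)\,|H(\omega)|^2\,d\omega
  = S_x(\omega_0) + S_x(-\omega_0) = 2\,S_x(\omega_0),
\end{aligned}
$$

where the last equality uses the symmetry $S_x(-\omega_0) = S_x(\omega_0)$.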
<span style="color:red">I am wondering if this proof would work if I had constructed my filter with only 1 impulse, at $\omega_0$.</span>

---

The vector $Ax$ is a linear combination of the columns of the matrix $A$, where the coefficients are the components of the vector $x$. But if instead we wanted the information about each weighted column $x_i a_i$ separately, then we would have to multiply $A$ with $\mathrm{diag}(x)$ rather than $x$ (see the numpy sketch below).

<span style="color:red">How to prove that there exist at least 2 eigenvectors, orthogonal to each other, corresponding to the same eigenvalue (repeated twice) of a Hermitian matrix?</span>

All of these mean the same thing:
1. The determinant of A is 0
2. A is not invertible
3. A has linearly dependent columns
4. At least one of the eigenvalues of A is 0

There are 2 ways of proving that the product of the eigenvalues is the determinant of the matrix (a numerical check follows below):
1. $\det(A - \lambda I) = (-1)^n(\lambda - \lambda_1)(\lambda - \lambda_2)\cdots(\lambda - \lambda_n)$, where $A$ has $n$ eigenvalues. Now put $\lambda = 0$: we get $\det(A) = (-1)^n(-\lambda_1)\cdots(-\lambda_n) = \lambda_1\lambda_2\cdots\lambda_n$.
2. We know that a matrix with $n$ independent eigenvectors can be decomposed as $A = EDE^{-1}$. Taking the determinant on both sides, we get $\det(A) = \det(D)$, where $\det(D)$ is nothing but the product of the eigenvalues.

Let's take an arbitrary matrix $A$ that is neither symmetric nor antisymmetric. But remember, every matrix can be expressed as the sum of a symmetric matrix and an antisymmetric matrix. Now we want to observe $x^TAx$. Let's say $A = B + C$, where $B$ is symmetric and $C$ is antisymmetric; then $x^T(B+C)x = x^TBx + x^TCx$. On further observation we notice that $x^TCx$ is actually 0: all the terms cancel out. So $x^TAx = x^TBx$, and if $B$ is positive semi-definite then $A$ is also positive semi-definite. So for a matrix to be positive semi-definite it is not necessary for it to be symmetric or Hermitian (checked numerically below).

The main role of linear algebra is to help us solve the multiple linear equations that pop up in real life. It helps us answer the following questions:
1. Does a solution exist which satisfies all the equations simultaneously?
2. If multiple solutions exist, then how do we interpret them?
3. If no solution exists, then can we find an approximate solution?
4. If a solution does exist, then how do we compute it?
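The sketch for $Ax$ versus $A\,\mathrm{diag}(x)$, with a made-up $2 \times 2$ example: the first collapses the weighted columns into one vector, the second keeps each weighted column $x_i a_i$ visible.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
x = np.array([10.0, 100.0])

# A @ x sums the weighted columns into a single vector.
print(A @ x)            # [210. 430.]

# A @ diag(x) keeps each weighted column x_i * a_i separate.
print(A @ np.diag(x))   # [[ 10. 200.]
                        #  [ 30. 400.]]
```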
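The numerical check that the determinant equals the product of the eigenvalues, on a random (hypothetical) matrix:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))    # arbitrary hypothetical matrix
eigvals = np.linalg.eigvals(A)     # complex in general for a non-symmetric A
print(np.allclose(np.prod(eigvals), np.linalg.det(A)))   # True
```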
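And the check that the antisymmetric part contributes nothing to the quadratic form:

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((5, 5))    # neither symmetric nor antisymmetric
B = (A + A.T) / 2                  # symmetric part
C = (A - A.T) / 2                  # antisymmetric part

x = rng.standard_normal(5)
print(np.isclose(x @ C @ x, 0.0))        # True: x^T C x always cancels to 0
print(np.isclose(x @ A @ x, x @ B @ x))  # True: the quadratic form sees only B
```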
Now we want to analyse the relation between multiple random processes. A real-life example: consider the product prices of 2 companies, where one is an oil-processing company and the other is a paint company. The paint company requires oil products to manufacture its paints. We can model the price of each company's product as a random process and observe the correlation between the prices of both companies' products. Now at one point in time the oil company increases its prices, which leads to an increase in the paint price, as the expenditure of the paint company has increased. We can observe that the correlation between the random processes is positive (if the oil company reduces its price then the paint company also reduces its price). This correlation function is not only a mathematical concept but can also give us information about the companies' decisions. For example, if at some point the oil company's price is increasing but the paint company's price stays stagnant or even decreases (mathematically, the correlation between the prices would become 0 or negative), then we can guess that the paint company has struck a deal with some other oil company to fulfil its requirements, and we can find out which oil company by computing the cross-correlation functions between all candidate oil companies and this paint company. Whichever oil company shows the highest correlation with our paint company is most likely its supplier. When modelling real-life situations, just 2 random processes are not enough, as the price of the oil company may depend on some other random process altogether.

---

# Signals and Systems by Alan V. Oppenheim

The Fourier transform is useful in helping us find the frequency response of the system. But since it doesn't always converge, we have to resort to the Laplace and z-transforms. The main reason we switch to the Laplace and z-transforms is to find the combinations of frequency and damping factor for which the transform does exist. Drawing a 3D picture of the transforms is not feasible, but plotting the locations of the poles and zeros on the s-plane and z-plane respectively is helpful. Here is how: after we have found our Laplace or z-transform, we plot the poles and zeros, and simply put $s = j\omega$ or $z = e^{j\omega}$ and take the magnitude. Things get really simple if the transform is a ratio of 2 polynomials, because each polynomial can be expressed as a product of factors $(x - x_i)$, where the $x_i$ are the roots. On taking the modulus of each factor, $|s - s_i|$ or $|z - z_i|$ is nothing but the distance between 2 complex numbers. So, up to a constant gain, the magnitude response is

$$|H| = \frac{\prod_i (\text{distance of } \omega \text{ from zero } i)}{\prod_i (\text{distance of } \omega \text{ from pole } i)}$$

(a numpy walk along the unit circle is sketched below).

The Laplace transform is just the Fourier transform of $x(t)e^{-\sigma t}$. The reason we multiply our function by that extra exponential term is that the Fourier transform doesn't exist for non-convergent integrals, so we want to tame the growing functions by multiplying them with a decaying function so that the integral converges. The z-transform is the same story: in the DTFT we try to express the signal in terms of $e^{j\omega n}$, but the sum might not converge, so we multiply the signal by $r^{-n}$, and if we put $r > 1$ then we can force a (right-sided) growing signal to become convergent. That is the reason the region of convergence of the Laplace transform depends on the range of real values that $\sigma$ can take, rather than on the range of imaginary values that $\omega$ can take. Similarly for the z-transform, it is the radius $r$ that determines whether the transform converges, so the ROC of the z-transform is defined in circles, as the angle can take any value.

Linear constant-coefficient differential equations (or difference equations) are a very important generalisation of common systems. Since their study is important, we analyse their Laplace and z-transforms very keenly. It turns out that for such systems the system function is rational, meaning it can be expressed as a ratio of two polynomials. Another advantage of a rational system function is that finding its inverse is very easy, as the expression can be written as a sum of partial fractions whose inverses are already known (a worked example follows below).

One more important point about the Laplace and z-transforms is that their ROC can't be a union of disjoint strips (in the case of the Laplace transform) or rings (in the case of the z-transform). Take the example of a right-sided, exponentially growing function: find the $\sigma$ that makes that function's envelope constant; for every $\sigma$ greater than it, the weighted function only decays faster, so it is integrable, and hence the ROC must include all $\sigma$'s above it, with no gaps. For a two-sided signal the ROC is the intersection of such one-sided regions, which is a single strip, never disjoint pieces. The same goes for the z-transform.
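The numpy walk mentioned above: evaluating the magnitude response purely from pole-zero distances, with a made-up zero at $z = 1$ and a hypothetical conjugate pole pair near the unit circle at angle $\pm 0.5$.

```python
import numpy as np

# |H(e^{jw})| from pole-zero geometry (gain taken as 1 in this sketch):
# product of distances to zeros over product of distances to poles.
zeros = np.array([1.0 + 0j])                                          # hypothetical zero
poles = np.array([0.9 * np.exp(1j * 0.5), 0.9 * np.exp(-1j * 0.5)])   # hypothetical poles

w = np.linspace(0, np.pi, 1000)
z = np.exp(1j * w)                         # walk along the unit circle
H_mag = (np.prod(np.abs(z[:, None] - zeros), axis=1)
         / np.prod(np.abs(z[:, None] - poles), axis=1))

print(w[np.argmax(H_mag)])   # ~0.5: the response peaks where z passes nearest a pole
```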
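A worked one-liner for the ROC intuition above, for a right-sided exponential $e^{at}u(t)$ with hypothetical growth rate $a$: the Laplace transform is just the Fourier transform of the $\sigma$-weighted signal, and it converges only when $\sigma$ beats the growth.

$$X(s) = \int_0^{\infty} e^{at} e^{-st}\, dt = \int_0^{\infty} \underbrace{e^{(a-\sigma)t}}_{\text{must decay}}\, e^{-j\omega t}\, dt = \frac{1}{s - a}, \qquad \text{ROC: } \operatorname{Re}(s) = \sigma > a$$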
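And the partial-fraction example promised above: a hypothetical second-order rational system function inverts term by term once it is split into partial fractions (choosing the causal ROC here).

$$H(s) = \frac{1}{(s+1)(s+2)} = \frac{1}{s+1} - \frac{1}{s+2} \;\Longrightarrow\; h(t) = \left(e^{-t} - e^{-2t}\right)u(t), \qquad \text{ROC: } \operatorname{Re}(s) > -1$$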
Just by having the transform function one can't find out which time-domain signal it corresponds to. For example, if the Laplace transform is $\frac{1}{s+a}$ then it could mean either $e^{-at}u(t)$ or $-e^{-at}u(-t)$. But if I had known the ROC then I would have been able to figure out which time-domain function was being referred to. So for a unique inverse of a transform to exist, we must know both the transform function and the ROC. Similar arguments hold for the z-transform. The linear constant-coefficient differential equation gives us only the algebraic expression of the system function/transfer function, but the properties of the system like stability and causality tell us about the ROC of the system, and to uniquely identify the time-domain signal from its transform, both the transform function and the ROC are needed.

Important points to be remembered (a standard $a^n u[n]$ example illustrating both points appears at the end of this note):
1. A causal system has its impulse response as a right-sided function whose value is zero for negative time. This implies that my exponential factor $e^{-\sigma t}$ must make the right side of the signal decay, so $\sigma$ must be greater than some $\sigma_0$; this corresponds to an ROC which lies to the right of the rightmost pole. The same argument holds for the z-transform: for the system to be causal, its ROC must be outside the outermost circle and include infinity (no pole at infinity).
2. If my ROC includes the imaginary axis, it means that the Fourier transform exists, and hence the integral of the absolute value of the function is finite, which is the condition for stability. Similarly for the z-transform: if the ROC contains the unit circle (where the z-transform reduces to the Fourier transform), then the Fourier transform exists, and for the Fourier transform to exist the sum of the absolute values of the sequence must be finite, which is the condition for stability.

The notational confusion about whether $X(j\omega)$ or $X(\omega)$ is correct is finally resolved. Before the Laplace and z-transforms were introduced, the Fourier transform was written as $X(\omega)$. The Laplace transform then used the same notation with only a change of variable, $X(s)$, which became confusing, because $X(s)$ at $s = j\omega$ equals the Fourier transform. So instead of $X(\omega)$ we write $X(j\omega)$, and this means not the Fourier transform per se but the Laplace transform evaluated along the imaginary axis. When the z-transform came, it was defined as $X(z)$ with $z = re^{j\omega}$, and it gives the Fourier transform when $r = 1$. Since $X(z)$ at $z = e^{j\omega}$ equals the Fourier transform, we write $X(e^{j\omega})$ to denote the z-transform along the unit circle rather than the Fourier transform of a discrete-time signal.

Based on the ROC we can draw some useful conclusions about the time-domain signal:
1. If $\sigma >$ constant, or $r >$ constant, then the signal is right-sided, as we want $e^{-\sigma t}$ or $r^{-n}$ to decay the right tail as much as possible.
2. The opposite argument holds for left-sided signals.
3. If the ROC is a strip (Laplace transform) or a ring (z-transform), then it is a 2-sided signal.

---

# Linear operators

Examples include differentiation, integration, expectation, etc. For an operator to be linear, the output of a linear combination of inputs must be the same linear combination of the individual outputs. A useful property of common linear operators like these is that when they are applied in succession, the order in which they are applied does not matter (under suitable regularity conditions): $E(\int(\frac{d}{dx}))$ can be written in any order. An example of a non-linear operator is the modulus: for inputs $a$ and $b$, $|a+b| \neq |a| + |b|$ in general. So the order of operations cannot be exchanged in a composition like $\int|\cdot|$.
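A quick numeric illustration of both claims, using the sample mean as a stand-in for expectation: scaling (a linear operator) exchanges with the mean, while the modulus does not.

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.standard_normal(100_000)

# Linear operator (scaling by 3) commutes with the mean/expectation:
print(np.isclose(3 * x.mean(), (3 * x).mean()))          # True

# The modulus is not linear, so E[|x|] != |E[x]| in general:
print(np.isclose(np.abs(x).mean(), np.abs(x.mean())))    # False (~0.8 vs ~0)
```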
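And the standard example promised above for the causality/stability points:

$$h[n] = a^n u[n] \;\longleftrightarrow\; H(z) = \frac{1}{1 - az^{-1}}, \qquad \text{ROC: } |z| > |a|$$

The ROC is outside the outermost pole and includes infinity, so the system is causal; it contains the unit circle exactly when $|a| < 1$, which is also when $\sum_n |h[n]| = \sum_n |a|^n$ is finite, i.e. when the system is stable.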
---

# Digital Signal Processing by Alan Oppenheim

The maximum possible frequency in the digital domain is $\pi$, because the most fluctuating signal would have a negative value after every positive value, repeating forever. So the period is 2 samples, hence the frequency is $1/2$ cycles per sample, which implies $\omega = 2\pi \cdot \tfrac{1}{2} = \pi$ (checked with a DFT in the sketch below).

The main aim during signal processing is to minimise the amount of information required to describe the transformation. This leads us to a few properties from linear algebra, where a simple $n \times n$ matrix encompasses all the information required to transform an $n$-dimensional vector space. The idea is that since any vector can be expressed as a linear combination of the $n$ basis vectors, we only need to know how these $n$ basis vectors transform. Also, since the transformation is linear, the weights of the basis vectors carry through unchanged. Hence we also try to express our signals in terms of bases:
1. impulse signals
2. exponential signals

For now we will try to express an arbitrary signal in terms of impulse signals. Since any discrete-time system is a transformation of a sequence, if it is linear then I only need to know how the shifted impulses behave after inputting them to my system. But I would still need to know an infinite number of bases and how each one gets transformed. To overcome this we add another property to our system, shift invariance, which states that shifted impulse inputs give shifted impulse responses, so it is enough to know only $h[n]$ to characterise the system. On writing out the equations, we realise that this is where the idea of the convolution sum came from (sketched below). When we use complex exponentials as our bases, we observe that not only is the number of bases needed finite for length-$N$ signals (the $N$ DFT exponentials), but they are also eigenfunctions of the system: each one comes out only scaled, by $H(e^{j\omega})$ (also checked below).
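The DFT check for the maximum-frequency claim, on the fastest-alternating signal $(-1)^n$ (a made-up length of 8 samples):

```python
import numpy as np

n = np.arange(8)
x = (-1.0) ** n                 # fastest-alternating discrete signal, period 2
X = np.fft.fft(x)               # all its energy sits in a single DFT bin
k = np.argmax(np.abs(X))
print(2 * np.pi * k / len(x))   # 3.141..., i.e. w = pi
```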
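The convolution-sum sketch: build the output of a linear shift-invariant system directly from scaled, shifted copies of $h[n]$ (one per input sample) and check it against `np.convolve`; `x` and `h` are made-up short sequences.

```python
import numpy as np

def lsi_output(x, h):
    # Express x as a sum of scaled, shifted impulses; by linearity and
    # shift invariance, the output is the same combination of shifted h's.
    y = np.zeros(len(x) + len(h) - 1)
    for n, xn in enumerate(x):
        y[n:n + len(h)] += xn * h       # x[n] times h, shifted to start at n
    return y

x = np.array([1.0, 2.0, 0.5])           # hypothetical input
h = np.array([1.0, -1.0])               # hypothetical impulse response
print(np.allclose(lsi_output(x, h), np.convolve(x, h)))   # True
```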
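And the eigenfunction check: feeding a complex exponential through a hypothetical FIR filter returns the same exponential scaled by $H(e^{j\omega_0})$, once the start-up transient has passed.

```python
import numpy as np

h = np.array([0.5, 0.25, 0.125])         # hypothetical impulse response
w0 = 0.7                                 # hypothetical test frequency
n = np.arange(50)
x = np.exp(1j * w0 * n)                  # complex exponential input

y = np.convolve(x, h)[:len(n)]           # filter output (same length as x)
H_w0 = np.sum(h * np.exp(-1j * w0 * np.arange(len(h))))   # H(e^{jw0})

# Past the start-up transient, y[n] = H(e^{jw0}) * x[n]: x is an eigenfunction.
print(np.allclose(y[len(h):], H_w0 * x[len(h):len(n)]))   # True
```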