Whatever Goes...

		The evil of feof!!!
	"feof" tortured me for a long time because I simply did not understand what I had read. Do you see the bug in the code below?
Do you know what "feof" really means? It returns true only after you have TRIED to read PAST the end of the file, not BEFORE. Probably
very few people will make this kind of mistake, but I post it here as a memorandum.
 
#include <stdio.h>
#include <stdlib.h>

char* fileName="yourFile.txt";
#define BufferSize 20	/* in C, a const int cannot size a file-scope array */
char buffer[BufferSize+1];

int main()
{
	FILE* stream;
	char ch;
	char* ptr;
	int counter=0;
	if ((stream=fopen(fileName, "r"))==NULL)
	{
		printf("Cannot open file %s\n", fileName);
		exit(1);
	}
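	/* BUG (kept on purpose): feof() stays false until a read has already failed, */
	/* so the last fgetc() returns EOF (-1) and that value is stored in buffer.  */
	while (!feof(stream))
	{
		ch=fgetc(stream);
		buffer[counter++]=ch;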
		if (counter==BufferSize)
		{
			buffer[counter]='\0';
			printf("now output string of length %d : %s\n", BufferSize, buffer);
			counter=0;
		}
	}
	if (counter!=0)
	{
		buffer[counter]='\0';
		printf("now output last line of length %d : %s\n", counter, buffer);
	}
	printf("however this is not end!!! What I mean is that there is a big bug");
	printf(" which I always ignore: There is an evil character at end of this line");
	printf(" let me show you in ASCII number, and pay attention to last one!\n");
	ptr=buffer;
	while (*ptr!='\0')
	{
		printf("char:%c = ASCII: %d\n", *ptr, *ptr);
		ptr++;
	}
	return 0;
}


If you don't see any problem, just look at the running result below:
now output string of length 20 : The feof routine (im
now output string of length 20 : plemented both as a
now output string of length 20 : function and as a ma
now output string of length 20 : cro) determines whet
now output string of length 20 : her the end of strea
now output string of length 20 : m has been reached.
now output string of length 20 : When end of file is
now output string of length 20 : reached, read operat
now output string of length 20 : ions return an end-o
now output string of length 20 : f-file indicator unt
now output string of length 20 : il the stream is clo
now output string of length 20 : sed or until rewind,
now output string of length 20 : fsetpos, fseek, or
now output string of length 20 : clearerr is called a
now output last line of length 10 : gainst it
however this is not end!!! What I mean is that there is a big bug which I always ignore: There is an
evil character at end of this line let me show you in ASCII number, and pay attention to last one!
char:g = ASCII: 103
char:a = ASCII: 97
char:i = ASCII: 105
char:n = ASCII: 110
char:s = ASCII: 115
char:t = ASCII: 116
char: = ASCII: 32
char:i = ASCII: 105
char:t = ASCII: 116
char: = ASCII: -1
Press any key to continue
***********************************
Quite a few years later, I took another look at what I had been doing. The problem can be fixed by declaring ch as int and
leaving the loop as soon as the read itself fails:
if ((ch=fgetc(stream))==EOF) break;
"feof" is not a "peek" operation. It only reports what an earlier fgetc or fread has already run into. There is no call
that tests for the end of file without reading. No such thing at all!
*************************************
 
	

            The latency of disk head moves...

Everybody who has taken a computer architecture course knows that the average latency for the disk arm to move to the target track is 1/3 of the time it takes to move across all tracks. But how is this calculated? Honestly, I didn't know, so I wrote a simple program to simulate it and convince myself. However, there is one pre-condition for all kinds of "average waiting time" problems, no matter whether the waiting elements are processes, tasks, threads or whatever: the length of the queue must be stable, otherwise there is no average waiting time at all. Just imagine that the queue keeps growing because the output is much slower than the input. Can you predict the average waiting time? It increases to infinity. That is why I wrote a second version to test under different incoming rates. The result is exactly 1/3 if the tasks are sparse and few.

This is a very simple question. However, as I have little knowledge of integrals (amazing?), it became a very difficult problem for me.

After lunch I visited Mr. Zhu and understood how to do it with integrals. The "latency" has nothing to do with "waiting time". They are completely different things!!! Latency is simply the time needed for the disk head to move to the target position. We don't have to consider the waiting time of each task at all.

proof:

                                                 

center                                                                                                     edge

|---------------n tracks-------------|

|-----------------------------------------------------N tracks ------------------------------------------------|

1. Assume the disk head is at a fraction h of the way from the center of the disk to the edge. That is, if the total number of tracks is N and the head is n tracks away from the center, then h = n/N.

2. For a task located at a fraction x of the total tracks away from the center, the disk head takes time proportional to abs(h-x) to reach it. This abs(h-x) is the major factor of the latency.

3. Since the task is equally likely to be at any position on the disk, we can integrate to calculate the latency as follows. (Let's define S(exp, lower, upper)dx as the integral of the expression exp from "lower" to "upper".)

Latency(h) = S(h-x, 0, h)dx + S(x-h, h, 1)dx = h^2/2 + (1-h)^2/2 = h^2 - h + 1/2

4. Now we integrate over all positions of the disk head:

Average latency = S(h^2 - h + 1/2, 0, 1)dh = 1/3 - 1/2 + 1/2 = 1/3

 

Minimize DFA

> Blum(a German Prof.) Algorithm like Hopcroft Alg. :
>
> DFA-->DFA(min)
> Input: reduced DFA M = (Q,Sigma, delta, q0,F).
> Output: minimal DFA M´ = (Q´, Sigma, delta´,qo´,F´).
> Method:
>    (1)  t:=2; Q1:= F; Q2 := Q\F.
>   (2)  while there is i <= t, a <- Sigma
>         with delta(Qi, a) /<= Qj for all j <= t
>          do
>         1.  choose so one i <= t, a <-Sigma and j <= t
>              with delta(Qi, a) intersection Qj /= {}.
>         2.  Q(t+1) := {q <- Qi | delta (q, a) <- Qj};
>             Qi  := Qi\ Q(t+1);
>              t := t+1
>     od.
>  (3) Q´ : ={Q1,Q2,...,Qt};
>       q0´ := [q0];
>       F´:= {[q] <-Q´ |q <- F};
>      delta´( [q],a) := [delta(q,a)]
>      for all q <-Q, a<-Sigma.
>
 

Fascinating!!
Is this algorithm called the Blum algorithm? I have never heard about that. Maybe in
North America it goes by a different name. Anyway, the notation of the
pseudo code puzzled me for a while and finally I think I figured it out.
Please verify whether this is the idea:
1. The first step is to separate the states of the DFA into two collections: one contains the
final states F and the other is its complement.
2. Keep iterating over all collections to find a collection A and a character 'a' of Sigma
such that NOT all the arcs labelled 'a' from states in A go into one single collection.
(It means the states in A have arcs labelled 'a' pointing into more than one collection.)
Then pick a collection B that some of those arcs reach, and divide A into two
sub-collections: the states in A whose arc labelled 'a' points into B form a new
collection C, and A - C is the other collection.
3. Finally we get our minimal DFA, in which the final states are the
collections that contain an original final state from F.
The transition function, or delta, or the arcs, is such that the domain is
the class [q], i.e. the collection which contains q, and the
range is the class [delta(q,a)], in other words the collection
which contains the state delta(q,a).

Is my understanding right?
 

My miserable life in Concordia...

#include <stdio.h>
#include <stdlib.h>
#include <setjmp.h>

typedef int Exception;

jmp_buf jumpBuf;

#define CourseNumber 4

#define E_chem205 1
#define E_comp326 2
#define E_comp444 3
#define E_stat250 4

char* courseName[CourseNumber]=
{
	"chem205", "comp326", "comp444", "stat250"
};

typedef  void CourseFunc(void);


void chem205();
void comp326();
void comp444();
void stat250();

//function pointer array
CourseFunc* myCourse[CourseNumber]=
{
	chem205, comp326, comp444, stat250
};


int main()
{
	int week,course;
	Exception exp;
	//I have to suffer 13 weeks for a term
	for (week=0; week<13; week++)
	{
		printf("****************\nnow it is week %d\n****************************\n", week+1);
		for (course=0; course<CourseNumber; course++)
		{
			//setjmp must run BEFORE the course function is called, otherwise
			//longjmp jumps to a stale (or never initialized) jumpBuf
			if ((exp=setjmp(jumpBuf))==0)
			{
				myCourse[course]();
			}
			else if (exp==E_comp444)
			{
				printf("I don't believe you, just quit\n");
				exit(E_comp444);
			}
			else
			{
				printf("I know you are suffering from %s,", courseName[exp-1]);
				printf("you still have to hold on\n*************************\n");
				//do nothing but continue
			}
		}
	}
	return 0;
}


//there is an 80% chance that I complain, i.e. throw an exception in protest
void chem205()
{
	if (rand()%100<80)
	{
		printf("chemstry has too many compound name to memorize\n");
		printf("I want to throw an exception\n");
		longjmp(jumpBuf, E_chem205);
	}
	printf("I continue studying chem205\n");
}

//there is a 60% chance that I complain, i.e. throw an exception in protest
void stat250()
{
	if (rand()%100<60)
	{
		printf("statistic has too many integral caculus\n");
		printf("I want to throw an exception\n");
		longjmp(jumpBuf, E_stat250);
	}
	printf("I continue studying stat250\n");
}

//there is a 90% chance that I complain, i.e. throw an exception in protest
void comp326()
{
	if (rand()%100<90)
	{
		printf("computer architecture is really boring\n");
		printf("I want to throw an exception\n");
		longjmp(jumpBuf, E_comp326);
	}
	printf("I continue studying comp326\n");
}

//there is a 10% chance that I complain, i.e. throw an exception in protest
void comp444()
{
	if (rand()%100<10)
	{
		printf("system software design is exciting\n");
		printf("if I throw exception, I know I would quit all\n");
		longjmp(jumpBuf, E_comp444);
	}
	printf("I continue studying comp444 should I complain about anything?\n");
}	

 

****************
now it is week 1
****************************
I continue studying chem205
computer architecture is really boring
I want to throw an exception
I know you are suffering from comp326,you still have to hold on
*************************
I continue studying comp444 should I complain about anything?
statistic has too many integral caculus
I want to throw an exception
I know you are suffering from stat250,you still have to hold on
*************************
****************
now it is week 2
****************************
I continue studying chem205
computer architecture is really boring
I want to throw an exception
I know you are suffering from comp326,you still have to hold on
*************************
I continue studying comp444 should I complain about anything?
I continue studying stat250
****************
now it is week 3
****************************
chemstry has too many compound name to memorize
I want to throw an exception
I know you are suffering from chem205,you still have to hold on
*************************
computer architecture is really boring
I want to throw an exception
I know you are suffering from comp326,you still have to hold on
*************************
I continue studying comp444 should I complain about anything?
statistic has too many integral caculus
I want to throw an exception
I know you are suffering from stat250,you still have to hold on
*************************
****************
now it is week 4
****************************
I continue studying chem205
computer architecture is really boring
I want to throw an exception
I know you are suffering from comp326,you still have to hold on
*************************
I continue studying comp444 should I complain about anything?
statistic has too many integral caculus
I want to throw an exception
I know you are suffering from stat250,you still have to hold on
*************************
****************
now it is week 5
****************************
chemstry has too many compound name to memorize
I want to throw an exception
I know you are suffering from chem205,you still have to hold on
*************************
computer architecture is really boring
I want to throw an exception
I know you are suffering from comp326,you still have to hold on
*************************
I continue studying comp444 should I complain about anything?
statistic has too many integral caculus
I want to throw an exception
I know you are suffering from stat250,you still have to hold on
*************************
****************
now it is week 6
****************************
chemstry has too many compound name to memorize
I want to throw an exception
I know you are suffering from chem205,you still have to hold on
*************************
computer architecture is really boring
I want to throw an exception
I know you are suffering from comp326,you still have to hold on
*************************
I continue studying comp444 should I complain about anything?
statistic has too many integral caculus
I want to throw an exception
I know you are suffering from stat250,you still have to hold on
*************************
****************
now it is week 7
****************************
I continue studying chem205
computer architecture is really boring
I want to throw an exception
I know you are suffering from comp326,you still have to hold on
*************************
I continue studying comp444 should I complain about anything?
statistic has too many integral caculus
I want to throw an exception
I know you are suffering from stat250,you still have to hold on
*************************
****************
now it is week 8
****************************
chemstry has too many compound name to memorize
I want to throw an exception
I know you are suffering from chem205,you still have to hold on
*************************
computer architecture is really boring
I want to throw an exception
I know you are suffering from comp326,you still have to hold on
*************************
I continue studying comp444 should I complain about anything?
statistic has too many integral caculus
I want to throw an exception
I know you are suffering from stat250,you still have to hold on
*************************
****************
now it is week 9
****************************
chemstry has too many compound name to memorize
I want to throw an exception
I know you are suffering from chem205,you still have to hold on
*************************
computer architecture is really boring
I want to throw an exception
I know you are suffering from comp326,you still have to hold on
*************************
I continue studying comp444 should I complain about anything?
I continue studying stat250
****************
now it is week 10
****************************
I continue studying chem205
computer architecture is really boring
I want to throw an exception
I know you are suffering from comp326,you still have to hold on
*************************
I continue studying comp444 should I complain about anything?
statistic has too many integral caculus
I want to throw an exception
I know you are suffering from stat250,you still have to hold on
*************************
****************
now it is week 11
****************************
chemstry has too many compound name to memorize
I want to throw an exception
I know you are suffering from chem205,you still have to hold on
*************************
computer architecture is really boring
I want to throw an exception
I know you are suffering from comp326,you still have to hold on
*************************
I continue studying comp444 should I complain about anything?
statistic has too many integral caculus
I want to throw an exception
I know you are suffering from stat250,you still have to hold on
*************************
****************
now it is week 12
****************************
I continue studying chem205
computer architecture is really boring
I want to throw an exception
I know you are suffering from comp326,you still have to hold on
*************************
I continue studying comp444 should I complain about anything?
statistic has too many integral caculus
I want to throw an exception
I know you are suffering from stat250,you still have to hold on
*************************
****************
now it is week 13
****************************
chemstry has too many compound name to memorize
I want to throw an exception
I know you are suffering from chem205,you still have to hold on
*************************
computer architecture is really boring
I want to throw an exception
I know you are suffering from comp326,you still have to hold on
*************************
I continue studying comp444 should I complain about anything?
statistic has too many integral caculus
I want to throw an exception
I know you are suffering from stat250,you still have to hold on
*************************

 

How to convince myself that a forked process shares the file status table with its parent process?

1. "exec.o" opens a file and forks a child process. The child process then "exec"s a new program, "hello.o", which accepts the file descriptor passed by the forked process as a command line parameter. The new program then writes more into the file referenced by the passed file descriptor.

2. In order to prove that the parent and child processes share the file status table, I had the parent process write more lines after it forked the child. By doing this, the file offset should differ from what it was at the moment of the fork. But when the child process writes into the file, the contents line up seamlessly, i.e. the file offset seen by parent and child is exactly the same.

 

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>

int main(int argc, char* argv[])
{
	pid_t pid;
	int fd;
	int n=10;
	char buf[32];	//must hold "some more from parent" (21 chars) plus the '\0'
	char nameBuf[30];
	char fileName[]="this is string buf";
	while (n>0)
	{
		sprintf(nameBuf, "tempParent%d", n);
		if (open(nameBuf, O_CREAT, S_IRUSR|S_IWUSR)<0)
		{
			printf("open error for parent\n");
		}
		n--;
	}

	if ((fd=open(argv[1], O_CREAT|O_WRONLY|O_SYNC|O_TRUNC,
		S_IRUSR|S_IWUSR|S_IRGRP))<0)
	{
		printf("open error,\n");
		exit(7);
	}
	//
	//printf("sizeof(%s)=%d\n", fileName, sizeof(fileName));
	if (write(fd, fileName, strlen(fileName))!=strlen(fileName))
	{
		printf("probably write error\n");
	}
	printf("fd is still %d\n", fd);
	if ((pid=fork())<0)
	{
		printf("fork error\n");
		exit(1);
	}
	if (pid==0)
	{
		sleep(5);
		//while (n>0)
		{
			printf("I am child and printing\n");
		}
		sprintf(buf, "%d", fd);
		//printf("argv[2]=%s\n", argv[2]);
		printf("argv[2]=%s and buf=%s\n", argv[2], buf);
		//only 
		//strcpy(nameBuf, "./");
		//strcat(nameBuf, argv[2]);
		n=4;
		while (n>0)
		{
			sprintf(nameBuf, "%s%d", "temp", n);
			if (open(nameBuf, O_CREAT, S_IRUSR|S_IWUSR)<0)
			{
				printf("open error\n");
			}
			n--;
		}
		if (execlp(argv[2], argv[2], buf, NULL)<0)
		{
			printf("exec error\n");
			exit(3);
		}
		printf("will you see this?\n");
	}
	else
	{
		printf("fd is still %d\n", fd);
		//printf("I am parent and I will write more while child is sleeping...\n");
		strcpy(buf, "some more from parent");
		printf("fd=%d, strlen(buf)=%d\n", fd, (int)strlen(buf));
		if (write(fd, buf, strlen(buf))!=strlen(buf))
		{
			printf("write error from parent\n");
		}
		lseek(fd, 0, SEEK_SET);
		close(fd);	
		exit(0);
	}
	return 0;
}

 

 

 

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>

int main(int argc, char* argv[])
{
	int fd;
	char* string="this is child";
	printf("%s and argv[1]=%s\n", string, argv[1]);
	sscanf(argv[1], "%d", &fd);
	printf("fd=%d and string=%s", fd, string);
	printf("file offset is %ld\n", (long)lseek(fd, 0, SEEK_CUR));
	if (write(fd, string, strlen(string))!=strlen(string))
	{
		printf("error or write\n");
		exit(7);
	}
	return 0;
}

 

[root@sec05 mywork]# ./exec.o ./result.txt ./hello.o
fd is still 13
fd is still 13
fd=13, strlen(buf)=21
[root@sec05 mywork]# I am child and printing
argv[2]=./hello.o and buf=13
this is child and argv[1]=13
fd=13 and string=this is childfile offset is 0

This is what is inside the file "result.txt". Please note that the parent process calls "lseek" to move the shared offset back to 0, which causes the child process to overwrite the beginning of what the parent has written.

this is childg bufsome more from parent

    To convince myself that signals work, and also to practice function pointers and pointers to function pointers

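#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

//on Linux these three values match SIG_ERR, SIG_DFL and SIG_IGN from <signal.h>
#define SigErr  ((void(*)(int))-1)
#define SigDef  ((void(*)(int)) 0)
#define SigUsr  ((void(*)(int)) 1)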

void myHandler(int sigNo);
typedef void (*FuncPtr)(int);


FuncPtr regists(FuncPtr sigHandler, int sigNo);

typedef void (**FuncPtrPtr)(int); 

int main()
{
	//FuncPtrPtr ptr;
	FuncPtr* ptr;
	FuncPtr result;
	int n=0;
	
	result=regists(myHandler, SIGUSR1);
	ptr=&result;

	printf("result is 1:%ld\n", (long)result);
	//result(SIGUSR1);
	result=regists(myHandler, SIGUSR1);
	printf("register again and result should be my handler :%ld\n", (long)result);
	(*ptr)(SIGUSR1);
	//result(SIGUSR2);
	result=regists(myHandler, SIGUSR2);
	


	printf("pid=%d\n", (int)getpid());
	for ( ; ; )
	{
		printf("no. %d\n", n);
		n++;
		pause();
	}
	return 0;
}

FuncPtr regists(FuncPtr sigHandler, int sigNo)
{
	FuncPtr result;
        if ((result=signal(sigNo, sigHandler))==SigDef)
        {
                printf("default handling\n");
        }
        else
        {       
                if (result==SigErr)
                {
                        printf("error of register\n");
                }
                else
                {
                        if (result==SigUsr)
                        {
                                printf("signal user\n");
                        }
                        else
                        {
				//reached when a handler was already installed for this signal
                                printf("lets see what handler is:%ld\n", (long)result);
                                result(sigNo);
                        }
                }  
        }
	return result;
}


void myHandler(int sigNo)
{
	printf("the signal is %d\n", sigNo);
}

The running result is as follows:

[qingz_hu@alamanni ~/mywork] % gcc signal.c -o signal.exe
[qingz_hu@alamanni ~/mywork] % ./signal.exe
default handling
result is 1:0
lets see what handler is:134513992
the signal is 10
register again and result should be my handler :134513992
the signal is 10
default handling
pid=31720
no. 0

[qingz_hu@alamanni ~/mywork] %
[qingz_hu@alamanni ~/mywork] %
[qingz_hu@alamanni ~/mywork] %
[qingz_hu@alamanni ~/mywork] %
[qingz_hu@alamanni ~/mywork] %
[qingz_hu@alamanni ~/mywork] % clear
[qingz_hu@alamanni ~/mywork] % ./signal.exe &
[1] 31775
[qingz_hu@alamanni ~/mywork] % default handling
result is 1:0
lets see what handler is:134513992
the signal is 10
register again and result should be my handler :134513992
the signal is 10
default handling
pid=31775
no. 0

[qingz_hu@alamanni ~/mywork] % kill -USR1 31775
the signal is 10
no. 1
[qingz_hu@alamanni ~/mywork] % kill -USR2 31775
the signal is 12
no. 2
[qingz_hu@alamanni ~/mywork] % kill -USR1 31775
the signal is 10
no. 3
[qingz_hu@alamanni ~/mywork] % kill -USR2 31775
the signal is 12
no. 4
[qingz_hu@alamanni ~/mywork] % kill -20 31775

[1] + Suspended ./signal.exe
[qingz_hu@alamanni ~/mywork] %
 

The function pointer array is manipulated like this. It actually puzzled me for several hours to figure out how to declare a function pointer whose return type is a function pointer and whose parameters include a function pointer.

i.e. HandleType regists(int num, HandleType funcPtr);  is a declaration of this kind of function, but how do I declare a type for this function?

I must admit it is extremely confusing for me to do this without first using "typedef" to define HandleType. (Actually, after many trials I still did not know how.)

 

#include <stdio.h>
#include <stdlib.h>


typedef void (*HandleType)(int);

typedef HandleType  (*FuncPtr1)(int, HandleType);


HandleType regists(int num, HandleType funcPtr); 

void display();

void handler1(int num);
void handler2(int num);
void handler3(int num);
void handler4(int num);
void handler5(int num);

HandleType funcArray[5]=
{
	handler1, handler2, handler3, handler4, handler5
};

int main()
{
	FuncPtr1 ptr1;
	int i;
	//FuncPtr2 ptr2;
	HandleType result;
	printf("initial condition of funcArray\n");
	display();
	ptr1=regists;
	result=handler5;
	printf("let's register handler5 at the beginning and shift all\n");
	for (i=0; i<5; i++)
	{
		result=ptr1(i, result);
	}
	printf("after register, it should be like round robin\n");
	display();
	return 0;
}
	

void handler1(int num)
{
	printf("This is handler 1 and handling %d\n", num);
}

void handler2(int num)
{
	printf("This is handler 2 and handling %d\n", num);
}

void handler3(int num)
{
	printf("This is handler 3 and handling %d\n", num);
}

void handler4(int num)
{
	printf("This is handler 4 and handling %d\n", num);
}

void handler5(int num)
{
	printf("This is handler 5 and handling %d\n", num);
}

void display()
{
	int i;
	for (i=0; i<5; i++)
	{
		funcArray[i](i);
	}
}

HandleType regists(int num, HandleType funcPtr)
{
	HandleType result;
	printf("this is a function takes parameter of integer and a function pointer\n");
	printf("it also returns a function pointer\n");
	result=funcArray[num];
	funcArray[num]=funcPtr;
	return result;	
}
 
This is the running result:

[qingz_hu@alamanni ~/mywork] % ./funcptr.exe
initial condition of funcArray
This is handler 1 and handler handling 0
This is handler 2 and handler handling 1
This is handler 3 and handler handling 2
This is handler 4 and handler handling 3
This is handler 5 and handler handling 4
this is a function takes parameter of integer and a function pointer
it also returns a function pointer
this is a function takes parameter of integer and a function pointer
it also returns a function pointer
this is a function takes parameter of integer and a function pointer
it also returns a function pointer
this is a function takes parameter of integer and a function pointer
it also returns a function pointer
this is a function takes parameter of integer and a function pointer
it also returns a function pointer
after register, it should be like round robin
This is handler 5 and handler handling 0
This is handler 1 and handler handling 1
This is handler 2 and handler handling 2
This is handler 3 and handler handling 3
This is handler 4 and handler handling 4

How big a difference is there between different UNIX systems? 2005-02-07

Referring to theory assignment question 5, I know that as long as the API's interfaces are the same, the systems are compatible at least at the level of source code. That is, by re-compiling the source code we can run the program on another variation of UNIX. My question is: is it true that the variations of UNIX are so different that we have to re-compile the source code? After all, some Windows programs are also compatible with UNIX at the level of source code.

What are the major differences between the variations of UNIX from the perspective of a programmer? Different data alignment, big endian versus little endian? Different sizes of data types, like the size of an integer?
UNIX variations 2005-02-09
I'm not sure that I completely understand your question. The question on the assignment referred to the fact that if we use system calls on a POSIX compliant OS, we can expect to be able to port the source code to another POSIX compliant OS and recompile and run our code without problems (at least with respect to the system calls). This is the case because even though the calls are implemented differently, the name, parameters, return value, and meaning (i.e., semantics) are the same. Since Windows is "somewhat" POSIX compliant (exactly how much I'm not sure), we may often be able to take source from a Unix system and recompile it on Windows without too much of a problem.

Re-compilation is necessary, however, even with two POSIX compliant systems since the hardware (and hence instruction set) may be different, as well as the executable file format. The things that you mention - big endian, integer size, alignment - are all related to hardware and are usually irrelevant to the programmer. The endianness issue, for example, is handled by the compiler on each platform. There are exceptions to this, of course. The limits.h header file, for instance, lists things like the maximum size of an integer, which may be important to a particular program. You may also have to worry about endianness if you are creating files that must be read on different hardware platforms. But these are the exceptions. In general, you don't worry about these issues when porting code between versions of UNIX.

Dr Eavis
How to declare type of function "signal" 2005-02-12
I always suspected that the declaration of the "signal" function is actually a function pointer, and when I tried to declare a type for such a function pointer, I ran into difficulty. What I want to declare is a function pointer type whose return type is a function pointer and one of whose two parameters is also a function pointer. i.e. This won't work:

typedef void(*MyType(int, void(*)(int)))(int);

The compiler gives an error, something like declaring a function returning a function. So I guess the compiler thinks I am declaring a variable instead of defining a type?? Later I finally divided the job into two steps and it works. But I just don't know why.

typedef void (*HandleType)(int);

typedef HandleType (*FuncPtr1)(int, HandleType);
OK, I think I figured out, 2005-02-12
By adding one extra pair of parentheses, the compiler understands my intention that the return type is a function pointer: void (*)(int).

And the name of the type being defined sits inside another pair of parentheses:

((*MyType)(int, void(*)(int)))

i.e. It finally is something like this:

typedef void (*((*MyType)(int, void(*)(int)) ) )(int);

 

bad programming habit

share file after fork() 2005-02-12
I wrote the code below and got an unexpected result. Any help?

#include
#include
#include
#include

int main(int argc, char *argv[])
{
	int fd;

	unlink("fork");
	fd = open("fork", O_CREAT|O_WRONLY|O_SYNC, S_IRWXU);
	printf("this is main\n");
	switch (fork())
	{
	case -1 :
		perror("fork()");
		exit(1);
		break;
	case 0:
		write(fd, "this is fork\n", 60);
		exit(1);
		break;
	default:
		write(fd, "this is parent\n", 20);
		exit(1);
		break;
	}
	return(0);
}

content in file fork:

this is parent

this is fork

this is parent

question:

Where does the second "this is parent" come from?

 
This is probably due to a bad habit 2005-02-12
I tried the code with a small modification and everything works fine. My explanation is that your puzzle is probably due to a bad programming habit.

The only difference in my code is that the number of bytes I write is exactly the length of the string. i.e.

...

char* str1="this is parent";

...

write(fd, str1, strlen(str1));

...

Please note that in your code you unnecessarily ask write to write 60 bytes, although the string "this is fork\n" is only 13 characters long, so you are probably writing some extra garbage into the file. (The count of 20 in the parent has the same problem.) If my guess is correct, then when you write "this is fork", you are probably also writing the string "this is parent\n" into the file, because you write 60 bytes. Usually strings like "this is parent" are treated as constant data by the compiler, and probably both the "this is fork" and "this is parent" strings are allocated together in the "initialized data segment". There is a very good chance that they sit at neighbouring memory addresses. So, by coincidence, it looks as if one string was written twice.

This is my guess, but I am pretty sure it is the problem. You can modify your code and test it.

n.h.
fork 2005-02-14
Yes, n.h. is correct. The two strings are likely to be right next to one another in memory and because you use a large count value, both strings get written at once in the child process.

Dr Eavis

memory address space

About function dynamic library 2005-02-13
What I understand is that if our executable file dynamically links with some library, then the code text of the library will not be counted in the memory space of our executable program. i.e. The addresses of the symbols in the library functions will not be calculated as offsets relative to the beginning address of our program. In other words, the memory of the library functions is outside the limit of our virtual memory space, say 4GB. Is this correct?

Another question puzzles me if the above is correct. When we pass parameters to a library function, where does the system store them? Obviously they are not passed on our program's user space stack, because I think the library function is not even located in our local stack frame. Is it then possible that they are located on each program's kernel stack? This sounds reasonable when I recall what was mentioned in the lecture. But I am not so sure.

Thank you sir.

 
process memory 2005-02-14
All library code, whether statically or dynamically linked, will be part of our address space. In the dynamic case, the addresses of library functions must be "fixed up" at run-time. But ultimately, they must still be part of that 4GB address space.

By extension, this implies that a common user mode stack is used in all cases.

The real difference is simply that the library code does not become part of the "stored" executable, thereby allowing many user programs to share one copy of the library code...with each process seeing the text segment of the shared library as part of its own address space.

Dr Eavis
dynamic linking 2005-02-14
Thank you sir for your explanation. However, I think I still believe that the library code which is dynamically linked will not be part of 4GB address space of our program. The reason is like this:

1. From what we learned in comp229, we know that the functions and variables of a library become unresolved external symbols in the executable file after linking. The linker has no way to calculate the actual address of the library code, either because it can be anywhere in memory when the program is loaded, or because it is not the linker's job at this stage, since the linking is only completed at runtime. Usually we can even use a "stub" file to represent the library code when compiling, and the so-called stub file is essentially the header file, which gives only the interface of the library functions. By doing this we have the flexibility to upgrade the library without re-compiling the executable files, as long as the interface of the library remains unchanged. So we know for sure that the implementation of the library code is not needed at the compilation stage.

2. At runtime, the linking loader resolves those unresolved external symbols, the functions in the library, with the help of the operating system. Each symbol only needs to be replaced with a pointer to the actual "absolute" address of the function once the library is loaded in memory. The PC (program counter) jumps to the entry of the function, and when the function finishes the PC jumps back to the instruction following the call. There is no need, and no possibility, for our program to "reference" anything other than the name of the function. So the address used in the code for the name of a library function is an "absolute" address instead of a relative address. I think this is like the "long jump" in early Pascal when you call a function located in another module. The difference is that the "long jump" is resolved during the linking stage, when the different modules are combined and all of them are allocated memory addresses within 4GB, whilst dynamic linking can never determine the memory address of the library until runtime, and that address is obtained with the help of the operating system whenever the function name is called. My opinion is that the function name of a library is always treated as an external symbol and is not part of the 4GB memory space of our program, because an "absolute" address is not counted in any part of the virtual memory address space.
absolute address 2005-02-14
After I posted my opinion above, I searched again for the definition of "absolute" address mode, and I am sorry: I confused the concepts of "physical memory address" and "absolute address". An "absolute address" is still part of the virtual address space, which is mapped by the VMM. However, my argument above remains unchanged when absolute address is replaced with physical address. We know the implementation of the library is independent of our executable files. That is to say, after an upgrade of the library the size of its code may vary and our program still runs smoothly. This implies we don't need to care about the address or size of a library function. Therefore it proves the library is outside the 4GB address space.

 

 

pre-emptable and interruptable?

So, system call can be interrupted 2005-02-14
On page 275 it is said that a system call can be interrupted if it happens to be a "slow" one. I then recalled the first assignment about preemption of kernel code, since system calls run in kernel mode, right? I get a bit confused about these two concepts, "interruptable" and "preemptable". Does the fact that a system call can be interrupted suggest that it can be preempted? Or is the term "preemption" only applicable between processes rather than between functions within one process space?

I checked again with the solution of the first assignment and guess the term "reentrant" is very similar to "interruptable", because it is the "interrupt" (software or hardware) which enables "reentrance". Is that right?
syscall interruption 2005-02-14
Yes, these terms are a little confusing since they refer to similar issues. In terms of pre-emption, this refers to the notion of a process being "stopped" so that another process can run. This can certainly happen with user-mode processes since it is one of the primary scheduling mechanisms. For most kernels, however, it is not possible for executing kernel code to be pre-empted so that another user mode process can be run. Real-time OSs like Solaris are exceptions to this rule (from what I understand, the new Linux 2.6 kernel may also provide some degree of kernel pre-emption as well)

Kernel calls are interruptable, however. So when a signal arrives during the execution of a system call, the traditional response has been to terminate the syscall with an appropriate error return code. But this causes no problem with kernel "corruption" because the interrupt would only be delivered while the system call was blocked...say while waiting for the disk. At this point, system data structures are in a consistent state. In other words, the sys call doesn't get interrupted in a random location. And it makes sense to interrupt a "slow" system call at this point since the signal is likely to be something that has to be addressed immediately. But keep in mind that all of this is happening to one process. We are not pre-empting one process to run another.

Re-entrance refers to a multi-process (or multi-function) situation. For example, if one process blocks in a system call, it is possible for a second process to get the CPU and then execute that same system call. If a function is re-entrant, it can allow this without risk of corrupting or re-writing any internal variables or data structures.

Dr Eavis

 

My second assignment:

Questions:

1. File permission bits are generally easy to understand. However, directory
permissions are a little less intuitive. Explain how the executable bit is used
for directories and why it is different than the read bit.
2. When we open a file, we have the option of specifying the O_SYNC flag. We
can also obtain synchronization functionality by using the sync system call.
What is the difference between the two and what effect do they have on the
data located in the C library buffers. Explain.
3. The chdir system call can be used to modify the current working directory.
Say that we use chdir within a program to move to a completely new
directory. Our program then creates another process (say with fork) that
shares the same cwd. After our program terminates, however, and we get our
command prompt back, we’re still in the same directory that we were in
before we ran our program. Why does this happen?
4. With respect to the standard C libraries, explain how it is possible to do a line-oriented
write (e.g., fputs) that does not cause a line of data to be sent to the
kernel.
5. We saw with fopen that this C library call invokes the system call open (and
adds some extra functionality). Now, the OS provides a positioning function
called lseek. The C libraries provide a positioning function called fseek. So is
fseek a wrapper for lseek or is it possible to invoke fseek without lseek ever
being called?
6. There are two types of libraries: static and shared. The book suggests that
shared libraries make our executables smaller. Let’s say that the standard C
libraries could be compiled as either static or shared libraries. With respect to
the diagram on page 168 (the process space), explain the effect of using the
static version versus the shared version.
7. We said that the file system tries to ensure that directory inodes are spread
more or less evenly across block groups. This sounds nice but, really, why is
it useful?
8. Let’s say that we are using an ext2 file system that uses a block size of 1024
bytes. We create a file that contains 856,000 bytes. How will the file’s inode
keep track of these blocks? Be specific here and show the math.
9. I have just purchased a new hard disk that holds 80 GB of data and I have set
up a partition that is 30 GB in size. I install the ext2fs in this partition,
specifying a block size of 2048 bytes. What is the maximum number of data
blocks that I can store in a block group with this configuration?

 

  1. If only the “read” bit is set for a directory, we can only list the file names it contains (i.e. “ls directory-name”; even the switch “-l” cannot be used, since only the name in each “dirent” can be retrieved). To get more information such as access time or size, system calls like “stat” must look the names up inside the directory, and that requires the “execute” (search) bit. Without the execute bit we cannot open any file in the directory, cannot “cd” into it, and cannot list detailed information about the files it contains.
  2. With the “O_SYNC” flag, every “write” call on that descriptor returns only after the data has been physically written to disk, not merely copied into the O.S. I/O buffer. The “sync” system call instead asks the kernel to flush all modified buffers for all files, and between a “write” call and a later “sync” the data sits in the kernel buffer for an unknown time. Neither one touches the C library buffers: system calls like “write” and “read” are un-buffered calls and go directly to the system I/O buffer. The only relation is that the C library file API is usually implemented with these system calls, so data written with “fwrite” stays in the C library buffer until the library calls “write” (for example after “fflush”), and only then can O_SYNC or sync have any effect on it.
  3. The current working directory is an attribute of each process, not of the user. The shell starts our program as a child process, and the child gets its own copy of the CWD. When the program calls “chdir”, only its own CWD changes (and that of any process it later creates with fork, which inherits it). The CWD of the parent shell is never touched, so when the program terminates the prompt is still in the directory where we started.
  4. The C library has an internal buffer that holds data ready to be written to the operating system I/O buffer. Usually this buffer is fairly large compared with the length of a common line, say 256 bytes. When the stream is fully buffered (for example when it refers to a file rather than a terminal) and “fputs” writes a line shorter than the free space in the buffer, the line simply stays in this internal buffer. As long as the buffer is not full, the C library does not flush its contents to the system I/O buffer.
  5. In the C library, a structure called FILE is used to maintain file-related information. One of its fields records the current offset of file operations. Therefore, when “fseek” is called, the C library can often just recalculate the offset field of the FILE structure without invoking “lseek”, i.e. when the user wants to move forward n bytes from the current position, the library could just call “read” (n/buffer_size + 1) times to read the contents into its internal buffer. More likely, when the number of bytes the user wants to move is smaller than the size of the internal buffer, the C library just moves the current pointer within the internal buffer without making any system call.
  6. When we use static linking, the linker copies all the functions and variables used from the library into our executable file. That is to say, when the program is loaded into memory the library code is also part of the “text”, “initialized” and “uninitialized” segments, since these functions and variables are treated no differently from our own code, and the executable file on disk is much larger. In contrast, when we use dynamic linking, the functions in the library are represented in the executable only as external symbols, which are resolved at run-time by the dynamic loader and linker with the help of the operating system, so the executable stored on disk is much smaller. At run-time the shared library is still mapped into the address space of the process (typically in the region between the heap and the stack), and its functions use the ordinary user stack, but one copy of the library text in physical memory can be shared by all the processes that use it, so the total memory used is also reduced.
  7. This makes file access perform better. When users access a directory, their real purpose is to access the files it contains or to retrieve information about those files, i.e. to read or write file contents or to look at information like the size and access time of files. By the “principle of locality”, if a directory and the files in it are located in the same block group, the O.S. achieves better performance, since disk I/O is a time-consuming operation. To keep a directory and the files it contains in the same block group, the file system must NOT store all directories’ inodes in one particular block group. Otherwise, as the number of files in each directory increases, that one block group will be unable to hold the data blocks and inodes of the new files. Therefore, to cope with the growth of each directory, the file system spreads directory inodes evenly across all block groups.
  8. In the inode of a file there is a field which consists of 15 data block pointers. The first 12 of them are direct pointers, each of which points to one data block. The remaining 3 are the single indirect, double indirect and triple indirect pointers.

Number of pointers in one block: 1024 / 4 = 256(pointers/block)

a)  856000/1024 = 835.9, i.e. 836 blocks   //it exceeds the 12 direct pointers

b)  (836-12)/256 = 3.22 > 1     //it exceeds the single indirect pointer

c)  (836-12-256)/256 = 2.22 < 256    //it falls within the double indirect pointer

As shown above, the 12 direct pointers, the indirect pointer and the double indirect pointer are used to keep track of the 836 data blocks in total. Assuming the index of data blocks starts from 0, when the index is smaller than 12 it is also the index of the direct pointer. When the index is in the range from 12 to 12+256-1=267, the address is in the pointer block pointed to by the indirect pointer. When the index is in the range from 268 to 835, the address of the data block is retrieved by following the double indirect pointer, which needs 3 second-level pointer blocks for the remaining 568 blocks.

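  9.  30G/2K = 15728640 blocks in the partition. Each block group has a single block for its block bitmap, i.e. 2K x 8 = 16384 bits, so one group holds at most 16384 blocks (a few of them are taken by the bitmaps and the inode table, so slightly fewer are data blocks). 15728640/16384 = 960 is the number of block groups.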

    Signal!Signal!!Signal!!!

Is signal queued in process table's list 2005-02-23
I am not sure if I have asked the same question before. However, my test and the answers from different sources still conflict.

1. Assume a parent process forks several child processes and goes to sleep.

2. While the parent is asleep, each child process sends the parent a signal, say SIGUSR1, an arbitrary number of times. When the parent process wakes up, how many times has the handler of SIGUSR1 been executed?

a) Is it the sum of times each child sends? Obviously not and my test confirms this.

b) Is it the number of child processes? I am not sure because it doesn't make much sense. Since I only tested cases with two or three children, I am not sure whether it is a coincidence or not.

c) Is it only ONCE? If the signal list is implemented as a bitmap, this should make sense, because no matter how many signals of the same type are received, it only shows one occurrence.

d) Is it an undefined number? My test seems to favour this assumption. But I still don't understand the reason.

The following is my test program and the running results.
 
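#include <stdio.h>
#include <stdlib.h>	/* srand, rand, exit */
#include <signal.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>	/* fork, sleep, pause, getpid, getppid */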
int count=0;
int signalCount=0;
void handler(int no);

void childCount(int no);

const int ChildNumber=10;
int main()
{
	int i;
	pid_t pid;
	srand(time(0));
	if (signal(SIGUSR1, handler)==SIG_ERR)
	{
		printf("error of signal\n");
	}
	if (signal(SIGCHLD, childCount)==SIG_ERR)
	{
		printf("error of signal\n");
	}		
	for (i=ChildNumber; i>0; i--)
	{
		if ((pid=fork())==0)
		{
			sleep(i);
			i=rand()%i;
			printf("child %d sends signal %d times\n", getpid(), i);
			while(i>0)
			{
				kill(getppid(), SIGUSR1);
				i--;
			}
			exit(0);
		}
	}
	sleep(ChildNumber);
	while (count<ChildNumber)
	{
		pause();
	}
	return 0;

}

void childCount(int no)
{
	count++;
	printf("%d child dies\n", count);
}


void handler(int no)
{
	signalCount++;
	printf("user1 signal number %d\n", signalCount);
}

This is the running result when redirected to a file:

child 32303 sends signal 0 times
child 32302 sends signal 1 times
child 32301 sends signal 0 times
child 32300 sends signal 3 times
child 32299 sends signal 1 times
child 32298 sends signal 3 times
child 32297 sends signal 2 times
child 32296 sends signal 7 times
child 32295 sends signal 0 times
child 32294 sends signal 1 times
1 child dies
user1 signal number 1
2 child dies
3 child dies
user1 signal number 2
4 child dies
user1 signal number 3
5 child dies
user1 signal number 4
6 child dies
user1 signal number 5
7 child dies
user1 signal number 6
user1 signal number 7
user1 signal number 8
8 child dies
9 child dies
user1 signal number 9
10 child dies
 

This is the running result when executed from command line. The display has some subtle difference, but the big picture is similar:

child 32463 sends signal 0 times

1 child dies

child 32462 sends signal 1 times

user1 signal number 1

2 child dies

user1 signal number 1

child 32461 sends signal 0 times

3 child dies

child 32460 sends signal 1 times

user1 signal number 2

4 child dies

user1 signal number 2

child 32459 sends signal 4 times

user1 signal number 3

5 child dies

user1 signal number 3

child 32458 sends signal 3 times

user1 signal number 4

6 child dies

user1 signal number 4

child 32457 sends signal 4 times

user1 signal number 5

7 child dies

user1 signal number 5

child 32456 sends signal 5 times

user1 signal number 6

8 child dies

user1 signal number 6

child 32455 sends signal 6 times

user1 signal number 7

9 child dies

user1 signal number 7

child 32454 sends signal 9 times

user1 signal number 8

10 child dies

user1 signal number 8

The behaviour of  sigsuspend(sigset_t*) (A personal reminder)

1. When you fork a new child, the child inherits the signal mask and also the installed signal handlers, since it gets a copy of the parent's address space and the handler functions still exist there. Handlers are reset to the default only when the child calls exec, because the handler functions no longer exist in the new program.

2. No matter what signals you have blocked, the signal mask is replaced by the parameter of sigsuspend(sigset_t*) for as long as the call lasts. This is stated very clearly, but I simply didn't get it or believe it until I tried it myself.

3. sigsuspend() returns when it is interrupted, i.e. when a signal that is not in the mask you passed is caught, and on return the block mask is reset to the original one.

Problem with sigaction 2005-03-02
The tutor for the lab says that it's better to use sigaction instead of signal, as the book says on page 296. So I've put the function from page 298 in my program, but when I compile it, it says:

size of storage for 'act' is not known, and the same thing for oact. I don't know where the problem is, since it's the code given in the book and I thought it should work.
Have you typed "struct" before sigaction 2005-03-03
In C, a function and a structure can have the same name, and this has fooled me a couple of times. If you only type "sigaction" without adding "struct" before it, the compiler gives this kind of error, for example when you use "sizeof" on it. This happened to me once.

 

Some discoveries of "pipe" 2005-03-03
1. You don't have to close the "read" descriptor on the writer side and the "write" descriptor on the reader side EXPLICITLY. They will work fine, and I think closing them is only a good programming habit that prevents another process from both reading and writing at the other end of the pipe. Because by closing read, the pipe cannot be written from the other side. (Maybe this is the reason for "half-duplex", since multiple reads will destroy the internal pointer.)

2. I deliberately tried to read from both sides and write from only one side. Even though both reads get the message, the later read is not a BLOCKED read anymore, i.e. every call of read from the second reader returns immediately.

3. In Linux, the "PIPE_BUF" constant is undefined unless <limits.h> is included. More importantly, it is not a limit on how much you can write: it seems you can write data of whatever size you like. In my case, I tried a buffer of 1024x1024 bytes for writing, and the write and the read both work fine. The following is the test program.

4. I dimly remember that in the O.S. course the professor mentioned that for a "pipe" the read and write operations are synchronized internally. But I am not sure.

5. The logic of my program is like this:

i) I create a large buffer and initialize it by writing the "block" number at each block position, i.e. no. m is at position m*BlockSize. By doing this, I can check in the output file whether data has been corrupted by overlapping.

ii) The parent writes the whole buffer into the pipe, and the child reads from the pipe and writes into an output file.

iii) The buffer size should already exceed the internal pipe buffer size. However, the write call doesn't split its operation into several writes. I know this by counting the read operations. It is a single write and a single read. I guess the O.S. synchronizes the read and write internally, otherwise there would be nothing special about a "pipe", which is a file.

 
#include <sys/stat.h>
#include <unistd.h>
#include <stdlib.h>
#include <string.h>	/* strlen, memcpy */
#include <sys/types.h>
#include <fcntl.h>
#include <stdio.h>

const int LineSize=1024*1024;
#define  BlockSize 16
char* outname="out.txt";
const int Permission=S_IRUSR|S_IWUSR|S_IRGRP;
char block[BlockSize]={0};
int fds[2];

int main(int argc, char* argv[])
{
	pid_t pid;
	char buf[LineSize];
	int fdin, fdout, i=0, n=0;
	char msg[20];
	if (pipe(fds)<0)
	{
		printf("pipe error\n");
	}
	
        if ((fdout=open(outname, O_WRONLY|O_CREAT, Permission))==-1)
        {
                printf("write open error\n");
        }
	//initialize buf to be full of number of blocks...
	while (n<LineSize/BlockSize)
	{
		sprintf(block, "no.%d\n", n+1);
		i=strlen(block);
		//just pad space after number
		while (i!=BlockSize-1)
		{
			block[i]=' ';
			i++;
		}
		block[BlockSize-1]='\n';
		
		//block fills all BlockSize bytes and has no '\0', so copy by length
		memcpy(buf+n*BlockSize, block, BlockSize);
		n++;
	}
		
		
	if ((pid=fork())==0)
	{
		//close the unused write end, otherwise read never sees end-of-file
		close(fds[1]);
		while ((n=read(fds[0], buf, LineSize))>0)
		{
			if (write(fdout, buf, n)!=n)
			{
				printf("write to output error\n");
			}
			sprintf(msg, "child read %d\n", n);
			if (write(STDOUT_FILENO, msg, strlen(msg))!=strlen(msg))
			{
				printf("display error\n");
			}		
		}
	}
	else
	{
		if (write(fds[1], buf, LineSize)!=LineSize)
		{
			printf("write error\n");			
		}
	}
	return 0;
}
	
The signal SIGPIPE is necessary 2005-03-04
Regarding my question in class about SIGPIPE, it seems the signal is necessary because there is nothing wrong with the writer writing even though the reader has already closed the read end. The system call "write" just succeeds in writing and no error is given. This design is quite logical and consistent with the semantics of the write call. It is up to the application to decide whether this seemingly useless write is an error or not. And I guess the system needs to keep "write" an atomic operation, which is done without checking first; after the write is done the system checks the read end, and then the signal is generated.
sigpipe 2005-03-05
It seemed odd to me that a write error could not be produced in this situation. After all, if there is a problem writing, then it should be just as easy to return a write error as to generate a SIGPIPE. So I did a little digging and it appears that a write error IS returned when you try to write to a pipe that has been closed at the other end. Specifically, it is the EPIPE error. So an error AND a SIGPIPE signal are generated. I did a quick test and it is true that a SIGPIPE signal and an EPIPE error are returned. We normally don't see the EPIPE error because the default action of the SIGPIPE handler is to terminate the program.

So why is there a SIGPIPE signal and an EPIPE error? I'm not sure. It has been suggested that SIGPIPE is there only to force immediate termination in sloppy programs that don't do proper error checking on the write. But, surely, there's a better reason than that since this could be true of other calls as well.

Dr Eavis
A little supplementary point 2005-03-05
I did another quick test and found a little interesting thing. For the sloppy programmer, the write error does not appear. That is, if the programmer forgets to close the "read" end on his side, then even if the other process closes its "read" end there is no write error or SIGPIPE signal at all. The reason is quite interesting and also a kind of intelligent design. After a pipe is created and the process forks, there are two read descriptors and two write descriptors. By convention one read and one write descriptor should be closed. A write error is only produced at an attempted write once every read descriptor is closed. i.e. if the programmer is sloppy and forgets to close his "read" end, then even when the other process correctly closes both its read and write descriptors he still doesn't get a write error, because while his own read descriptor is still open the system assumes he wants to communicate with himself. So the good programming habit is to always close the read descriptor which you won't use. By doing this, you will get an error when writing if the reader has closed his reading end.

Good programming habit always benefits.
If you are the good-habit programmer 2005-03-05
even if the other process is a sloppy programmer who always forgets to close his "write" end before he prepares his reading, you still have a chance to catch the write error if you remember to close your "read" end before writing. That is, when the other process only closes his "read" end without closing his "write" end, you will get a write error when writing. Because you have closed your "read" descriptor, no read descriptor is left open, and the write call checks whether the read end of the pipe is still open or not.

If you don't give enough information to the system, the system has no idea whether you want to read from or write to the pipe.

 

 

 

Error with pow 2005-03-05
I got this error from gcc:

/tmp/ccgWholx.o(.text+0x307): In function `Set_game_file_as_empty':

: undefined reference to `pow'

The API for the pow function is this:

#include <math.h>

double pow(double x, double y);

Obviously, I included math.h and placed non-constant double variables as parameter (and converted the return value to 'int').

Any inspirations as to the possible origins for this error?
pow problem 2005-03-05
If you see this error, it's generally because you must explicitly include the math library during the link phase. If you just have one source file this might look like:

gcc mysource.c -lm -o myfile.exe

The -lm switch (no space between the -l and the m) means that gcc should include the math library when linking.

Dr Eavis

 

 

And this is for my personal reminder of debugging for another running process by using "attach" in GDB

1. When the other process ID is available either by tracing or without tracing, you can attach the process by using GDB -p [pid] [exefile].

2. It is wrong to kill the debugged process within GDB because it may close all processes including children. Instead you should open another GDB process and debug over there. GDB is magic!

Killing each dead player is annoying 2005-03-08
As we are all users of the Linux server "alamanni", I noticed there is an increasing number of "dead player" programs left on the server. And it is really an annoying pain to "kill -9" each "player.exe" one by one when you are debugging your program. My little trick is to call the "atexit" function and register a simple handler which sends SIGQUIT to all child programs by calling "kill" with pid=0. Then whenever you shut down your "tournament.exe", you can be sure there is no dead child floating around on the server.
another option 2005-03-09
Another, perhaps simpler option, is to use the killall command. killall can be called from the command line and can be used to send a signal to all programs with a given name (only the ones you own). So "killall player.exe" would kill all of those processes at once.

Dr Eavis

 

STDOUT_FILENO and stdout 2005-03-07
Professor, I thought I observed something, but I am still not very sure. And I need your justification.

1. The testing example I wrote is a bit long, so I only give the idea.

2. When I use "dup2" to make STDOUT_FILENO point to the writing end of the pipe, I expected the file object "stdout" to be redirected to my pipe as well. But it seems not. i.e. after I call dup2(fds[1], STDOUT_FILENO), whenever I call write(STDOUT_FILENO, buf, buf_len) I am writing to the pipe instead of the original output. But what if I call fputs(buf, stdout)? I expected it to behave like the "write" call and go to the pipe. But it seems not.

3. If file library calls like fputs are implemented with the "write" syscall, they logically should be redirected after I dup2 onto STDOUT_FILENO, right?

Thank you for your time.
Sorry, forget about this stupid question 2005-03-07
With another check, I found out this is only due to the fact that I forgot "fflush(stdout)", and the example in the textbook already implies this. I am sorry to bother you.

Thanks anyway.

 

The reason you have to link the "math" library is the same as for "pthread". Recall that gcc should usually report an "unresolved function" at link time if the implementation of a function that is used is not supplied. However, when you compile a program which uses functions from "math.h" or "pthread.h", gcc just finishes fine without giving a linking error, but the program won't run. There are two questions here. 1. Why is no error given at the compiling stage? 2. Why do you have to link them yourself?

The second question is easy, since I guess these libraries are platform-specific. That is to say, each platform has its own implementation, and the best way to deal with this is to make them dynamic libraries so that the library interface can be kept uniform across platforms. But for the first question I have no clue at all. I guess there must be some compiler directive that stops gcc from linking functions in a dynamic library. But I don't know what it is or whether this is the answer. I noticed there is a keyword "__THROW" at the end of each pthread function. Is it the one? I think it is not, since in "math.h" I didn't find a similar one.

Thread test 1 2005-03-10
In class, some guy suggested to pass "main" as parameter in "pthread_create". Actually it works because whenever you pass as parameter is simply the address of a function, or address in "text" segment. However, gcc is also responsibly pointed out that "main" is not match with prototype of parameter by warning. And I tried to run the program, it did works. How can I be sure the thread succeed? I add printf at the beginning of "main" which prints the first argument in main.

By the way, there seems to be an upper limit of 5 on the number of threads I can create. At least this happens on the Linux server "alamanni", which gives the error "resource temporarily unavailable".

However, when I recompile and run the same program on "sunset", which is said to be a Unix server, there is no such limit. Only when I test the limit by setting the number of threads to 100000 is there an occasional "not enough space" error, which makes perfect sense since each thread uses up part of the 4 GB virtual address space of the process.

Based on the above observations, my conclusion is that Linux threads are indeed kernel-supported threads, which can be treated as a kind of hardware resource, while Unix threads are purely user-level threads, where the "pthread" library is implemented purely in software.

Am I right, professor?

 

Playing with pointers...(The following stuff is purely a personal reminder. It may contain ambiguous or incorrect content. Viewer discretion is advised.)

Look at the following code. Do you really understand "pass-by-pointer"? I thought I did, but I don't.
1. Every variable is just an alias for a memory address.
2. Pass-by-pointer just passes an address. See the following prototype: void foo(char* ptr);
int main()
{
	char* onlyPtr;
	foo(onlyPtr);//here you pass a wild pointer and you are lucky if program crashes.
}
3. What the compiler does is create a temporary variable of type char* which holds a copy of the value of your argument.
i.e.    char myCh;
	foo(&myCh); //a temporary char* variable is created to hold the value &myCh, which is the address of myCh.
4. What should I say about this simple question? I sometimes boast of being a decent C/C++ programmer whose basic
instinct is the handling of pointers. However, I found out I was a bit confused by the "passing-by-pointer" mode.
Let's keep one simple thing in mind, because everything else is not absolutely necessary:
There is ONE AND ONLY ONE MODE IN C, "passing-by-value", nothing more and nothing less.
5. Pay attention that a memory address is passed to the function, and you can do whatever you like with that
address. When the function returns, you can retrieve whatever the function stored at that address, even another
address. I can even pass the address of an integer, which happens to have the same size as an address on the
current 32-bit architecture.
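#include <stdio.h>
#include <stdlib.h>

#define PtrSize sizeof(char*)
char* createBuf(char* ptr);

int main()
{
        char* array;
        char ptr[PtrSize];   /* raw storage, just big enough to hold one address */
        array=createBuf(ptr);
        if (array==NULL)
        {
                return 1;
        }
        printf("malloc contents : %s\n", array);
        /* read back the address that createBuf stored inside ptr */
        printf("and from ptr is %s\n",*((char**)ptr));
        free(array);
        return 0;
}

char* createBuf(char* ptr)
{
        char* buf;
        int i;
        buf=(char*) malloc(500*sizeof(char));
        if (buf==NULL)
        {
                return NULL;
        }
        for (i=0; i<26; i++)
        {
                buf[i]='a'+i;
        }
        buf[i]='\0';
        /* treat the caller's storage as a char* and store the new address in it
           (a strict platform may require proper alignment for this cast) */
        *((char**)ptr)=buf;
        return buf;
}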
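The same trick, written the usual way: since everything is passed by value, a function that wants to hand back an address must receive the address of the caller's pointer. The sketch below is my own variant for comparison; the name create_buf and the int return code are assumptions for illustration only.

```c
#include <stdlib.h>

/* Hypothetical variant of createBuf: the caller passes the address of its
   own char*, and the function stores the new buffer address through it.
   Returns 0 on success, -1 on failure. */
int create_buf(char **out)
{
    char *buf;
    int i;

    if (out == NULL)
        return -1;
    buf = malloc(500);
    if (buf == NULL)
        return -1;
    for (i = 0; i < 26; i++)
        buf[i] = (char)('a' + i);
    buf[i] = '\0';
    *out = buf;     /* write through the copied address into the caller's variable */
    return 0;
}
```

A caller would declare char* p and call create_buf(&p); the copy of &p made for the call is thrown away afterwards, but what was written through it stays in p.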
        

synchronization---revisited

Actually there is little new in this synchronization test, except that I am now using the Linux pthread library with a condition variable instead of busy-waiting-style polling. It is almost exactly a remake of Dr. Probst's array increment in comp346. Now I begin to understand some of his lectures. I really regret what I said about him before. He is a good professor and I was stupid at that time, since my view was as narrow as what I could see.

Waiting on a condition variable is really a big leap from polling-style busy waiting, and there is one thing worth bearing in mind: "pthread_cond_wait" releases the mutex while the thread sleeps and implicitly re-acquires it before it returns, so you hold the mutex when you come out of the wait and it is always your job to unlock it. Another small issue is that you are better off calling "pthread_cond_broadcast" instead of "pthread_cond_signal" unless you are sure about the order in which waiting threads are woken up.

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <time.h>
#define Max_Threads 10

pthread_t threads[Max_Threads];

pthread_cond_t cond=PTHREAD_COND_INITIALIZER;

pthread_mutex_t mutex=PTHREAD_MUTEX_INITIALIZER;

int array[Max_Threads]={0,1,2,3,4,5,6,7,8,9};

int sequence[Max_Threads]={0,1,2,3,4,5,6,7,8,9};

void* func(void* arg);

void printArray();

void randomSequence();

int main()
{
	int i, *value;
	srand(time(0));
	printArray();
	randomSequence();
	for (i=0; i<Max_Threads; i++)
	{
		if (pthread_create(&threads[sequence[i]], NULL, func,&sequence[i]))
		{
			printf("create error\n");
		}
		else
		{
			printf("threas[%d] is created\n", sequence[i]);
		}
	}
	for (i=0; i<Max_Threads; i++)
	{
		if (pthread_join(threads[i], (void**)(&value)))
		{
			printf("join error");
		}
		printf("thread no.%d return with value=%d\n", i,*value);
	}
	printf("let me show you the array\n");
	printArray();
}

void randomSequence()
{
	int i,j, hold;
	for (i=0; i<Max_Threads; i++)
	{
		j=rand()%Max_Threads;
		hold=sequence[i];
		sequence[i]=sequence[j];
		sequence[j]=hold;
	}
}

void printArray()
{
	int i;
	for (i=0; i<Max_Threads; i++)
	{
		printf("array[%d]=%d,", i, array[i]);
	}
	printf("\n");
}


void* func(void* arg)
{
	int index;
	index=*((int*)arg);
	if (pthread_mutex_lock(&mutex))
	{
		printf("mutex lock error");
	}
	if (index==0)
	{
		array[index]++;
		printf("threads[0] finished\n");
		if (pthread_cond_broadcast(&cond))
		{
			printf("cond signal error");
		}
		if (pthread_mutex_unlock(&mutex))
		{
			printf("unlock eeeror");
		}
	}
	else
	{
		while (array[index-1]!=index)
		{
			if (pthread_cond_wait(&cond, &mutex))
			{
				printf("cond lock error\n");
			}
		}
		array[index]++;
		printf("threads[%d] finishes job\n", index);
		if (pthread_cond_broadcast(&cond))
		{
			printf("cond error");
		}
		if (pthread_mutex_unlock(&mutex))
		{
			printf("unlock eeeror");
		}
	}
	return &array[index];
}	
	
	

array[0]=0,array[1]=1,array[2]=2,array[3]=3,array[4]=4,array[5]=5,array[6]=6,array[7]=7,array[8]=8,array[9]=9,
threas[3] is created
threas[1] is created
threas[6] is created
threas[4] is created
threas[0] is created
threads[0] finished
threads[1] finishes job
threas[9] is created
threas[7] is created
threas[8] is created
threas[2] is created
threads[2] finishes job
threads[3] finishes job
threas[5] is created
threads[4] finishes job
thread no.0 return with value=1
threads[5] finishes job
thread no.1 return with value=2
thread no.2 return with value=3
thread no.3 return with value=4
thread no.4 return with value=5
threads[6] finishes job
threads[7] finishes job
threads[8] finishes job
thread no.5 return with value=6
thread no.6 return with value=7
threads[9] finishes job
thread no.7 return with value=8
thread no.8 return with value=9
thread no.9 return with value=10
let me show you the array
array[0]=1,array[1]=2,array[2]=3,array[3]=4,array[4]=5,array[5]=6,array[6]=7,array[7]=8,array[8]=9,array[9]=10,
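Stripped of the array business, the wait-then-unlock rule from this test boils down to the pattern below. It is a minimal sketch only; the struct gate and the function names are hypothetical names of my own.

```c
#include <pthread.h>

/* Hypothetical one-shot gate: threads block in gate_wait until gate_open runs. */
struct gate {
    pthread_mutex_t mutex;
    pthread_cond_t cond;
    int ready;
};

void gate_wait(struct gate *g)
{
    pthread_mutex_lock(&g->mutex);
    /* Re-check the condition in a loop: a wakeup does not prove it holds. */
    while (!g->ready)
        pthread_cond_wait(&g->cond, &g->mutex);  /* releases mutex while asleep */
    /* pthread_cond_wait returned with the mutex held again, so unlock it. */
    pthread_mutex_unlock(&g->mutex);
}

void gate_open(struct gate *g)
{
    pthread_mutex_lock(&g->mutex);
    g->ready = 1;
    pthread_cond_broadcast(&g->cond);            /* wake every waiting thread */
    pthread_mutex_unlock(&g->mutex);
}
```

A gate can be set up statically with struct gate g = { PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0 };.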
 

The puzzle of sleep... 2005-03-25
Dear professor, I extended your example from class a little to prove that a "sleep" call won't cause the whole process to sleep. Basically, I modified the pthread_attr to minimize the thread stack size so that I can create the maximum number of pthreads.

The program does exactly the same job as your program: it creates as many pthreads as possible, assigns an index number to each one, and each thread sleeps for a number of seconds equal to its index number. Therefore the threads just wake up one by one.

This all works fine on our lab machine and on sunset. But my question is that these seem to be too many for "kernel-supported" threads, because I tried 8000 threads on the Linux machine and 20000 threads on "sunset", which is Unix. I cannot persuade myself that Unix can support as many as 20,000 "kernel-aware" threads. Even on Linux, 8000 is too many for "one-to-one" kernel threads. The following is the code I am using:

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <limits.h>

const int MaxThread=20000;

void* run(void* arg);

int main()
{
	pthread_t threads[MaxThread];
	pthread_attr_t attribute;
	long i;	/* long so that the cast to void* below keeps its size */
	pthread_attr_init(&attribute);
	//if (pthread_attr_setstacksize(&attribute,PTHREAD_STACK_MIN))
	if (pthread_attr_setstacksize(&attribute,16384))
	{
		printf("set attribute error\n");
	}
	for (i=0; i<MaxThread; i++)
	{
		/* the index itself is smuggled through the void* argument */
		if (pthread_create(&threads[i], &attribute, run,(void*)i))
		{
			printf("unable create threads");
		}
	}
	sleep(MaxThread);
	return 0;
}

void* run(void* arg)
{
	int index=(int)(long)arg;
	char msg[40];
	sprintf(msg, "this is thread %d\n", index);
	write(1, msg, strlen(msg));
	sleep(index);	/* thread number n sleeps n seconds */
	strcat(msg, " wake up\n");
	write(1, msg, strlen(msg));
	return NULL;
}

 
thread max 2005-03-25
Actually, we discussed this in class at some point. The maximum thread limit -particularly on Linux - is usually not very high. Without adjusting the stack size, it may be as little as a few hundred threads. If you reduce the stack size - as you've done here - you'll be able to go higher. However, the pre-compiled pthread limit (i.e., for the libraries themselves) is typically 1024. So you won't be able to go any higher than this. Even if you could (by re-compiling pthreads), you would run into user resource and/or system limits that would likely prevent you from creating any more than a few thousand threads (this would also be true of other OSes). There is no chance that you could create 20000 threads, at least not without a lot of recompilation, system administration, and perhaps more memory.

Dr Eavis
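To see for myself why the stack size matters so much, here is a rough piece of arithmetic. It is only a back-of-envelope sketch: the figures of 3 GiB of user address space on 32-bit Linux and an 8 MiB default stack are my own assumptions, not numbers from the reply above, and the function name max_stacks is made up.

```c
/* Upper bound on how many thread stacks fit into a given amount of address
   space. It ignores code, heap and libraries, so the real limit is lower. */
unsigned long long max_stacks(unsigned long long address_space,
                              unsigned long long stack_size)
{
    if (stack_size == 0)
        return 0;
    return address_space / stack_size;
}
```

With the assumed figures, max_stacks(3ULL << 30, 8ULL << 20) gives 384, which is in the "few hundred threads" range, while max_stacks(3ULL << 30, 16384) gives 196608, so with the small stack some other limit must be the one that stops thread creation.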
I did succeed in running 20,000 threads 2005-03-25
on "sunset.cs.concordia.ca", which is Unix. My question is this:

My program proves that a "sleep" call will NOT put the whole process to sleep; it only puts the calling thread to sleep. This is a proof of "kernel-supported" threads. But 20,000 kernel-supported threads seem unbelievable to me, since I assume kernel-supported threads are a precious resource and cannot be created in arbitrary numbers at the user's request.

Professor, can you justify this, please?

By the way, the proof that each "sleep" call only puts the calling thread to sleep is exactly the same as your demo example in class. Every second, one thread wakes up.

 
high thread count 2005-03-26
It's quite unlikely that these are all kernel-supported threads. Remember that Solaris has traditionally used a two-level, many-to-many threading model. It multiplexes user-created threads on top of a potentially smaller number of kernel-supported threads. This multiplexing can be done automatically based upon the characteristics of the application. I suspect this is what you are seeing: thousands of user-level threads but a small number of kernel-supported lightweight processes. Unfortunately, I can't be sure since Solaris doesn't provide a lot of free information about its threading package.

Keep in mind that user-level threads do not prevent blocking calls like sleep from working properly. I said in class that it's easy to do thread-specific sleep calls with kernel-supported user threads since each thread is a schedulable kernel entity. However, user-level threads can also provide thread-specific blocking calls by providing new versions of the calls with the threading libraries (and using some kind of wrapper interface). It is complex but it can be done. So if user-level threads are being used in your example, the sleep calls may still work just fine.

Finally, keep in mind that in the latest releases of Solaris, Sun is apparently moving to the one-to-one model for threads. I've run thread creation tests on a couple of Solaris machines that I have access to and they can only create a couple of thousand threads before failing.

Dr Eavis
I also suspect they are kernel-aware 2005-03-27
threads.

1. To go to the extreme, I even tried 60,000 threads on "sunset", which is Unix, and it works fine. Maybe Unix uses a "hybrid" or "many-to-many" model like Solaris. But on Linux, about 8000 is the upper limit. Since the book claims Linux is one-to-one, 8000 seems to be too many.

2. User-level threads can be made thread-specific when blocking calls are made. This makes sense. But "sleep" is actually a system call and I cannot see any wrapper function over it. (Maybe pthread intercepts the signal handler, if "sleep" is implemented with the "alarm" signal.) Still, it is not very clear.

3. Linux threads must be using process IDs and must be kernel-aware. On the "alamanni" server, the administrator limits the number of threads we can create. So this is another proof that pthreads are kernel-aware threads in Linux. The question is just whether it is possible for the kernel to implement more than 8000 kernel threads under the "one-to-one" model. This is unbelievable.

4. It seems all the evidence forces me to believe that Linux does support more than 8000 kernel-aware threads. Is that right, sir?

 

My conclusion for above puzzle:

I developed my own theory to explain the above: "Unix" here is actually "Solaris", which uses a "hybrid" model. That means the kernel threads serve an unlimited number of user threads, like a "thread pool". Therefore I don't hit any limit on the number of "kernel-aware" threads in Unix. In Linux, on the other hand, the model is "one-to-one" and I do hit a limit on the number of threads I can create.

why do I have to specify "O_RDWR"? 2005-03-30
In the book's example on page 412, which is about "mapped I/O" and was also demoed in class, we have to specify the open mode for both the source and the target file. "Read only" is very reasonable for the source file, since I only need to copy contents from it. However, it is quite unexpected to have to specify "read and write", or "O_RDWR", for the target file, since I only want to write contents into it.

When I tried "O_WRONLY", "mmap" always failed with the error message "permission denied".

Professor, is there any reason for this? Or any rationale about this "read and write" instead of "write only"?
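As far as I understand the POSIX description of mmap, the descriptor has to be open for reading regardless of the protection requested, because the file's pages must be read into memory before the process can modify them. The sketch below reproduces what I saw; the function name copy_via_mmap and its parameters are my own, not the book's example.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Copy len bytes (len > 0) into the file at path through a shared mapping.
   access_mode is O_RDWR or O_WRONLY. Returns 0 on success, else errno. */
int copy_via_mmap(const char *path, int access_mode, const char *data, size_t len)
{
    int fd, err = 0;
    void *dst;

    fd = open(path, access_mode | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return errno;
    if (ftruncate(fd, (off_t)len) < 0) {     /* file must be as long as the mapping */
        err = errno;
    } else {
        dst = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        if (dst == MAP_FAILED) {
            err = errno;                     /* EACCES when fd is write-only */
        } else {
            memcpy(dst, data, len);          /* the "write" is a plain memory copy */
            if (munmap(dst, len) < 0)
                err = errno;
        }
    }
    close(fd);
    return err;
}
```

Called with O_RDWR the copy goes through; called with O_WRONLY the mmap step fails with EACCES, which perror prints as "permission denied".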

 

I write all the following test programs in a rather random style. Sometimes I just want to confirm a single idea that seems trivial to most of us. And of course most of them have a rather blurred purpose and may not even be understood by myself afterwards.

1. Broadcast test: I want to confirm that pthread_cond_broadcast only wakes up those threads that are already waiting on the condition variable. (This is insane! It is already stated very clearly both in the manual page and in the textbook. However, I gave it a try.)

I also practice the idea of multiple conditions, which I remember being discussed by Dr. Probst. My mind was not ready for it at that time.

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>


#define ThreadNumber 16

pthread_mutex_t mutex=PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t evencond=PTHREAD_COND_INITIALIZER, oddcond=PTHREAD_COND_INITIALIZER;
int choice=-1;
int counter;

void initialize();

void* run(void* arg);

int main()
{
	initialize();
	printf("sleep(2)\n");
	sleep(2);
	choice=getpid()%2;
	printf("choice=%d\n", choice);
	if (choice%2==0)
	{
		if (pthread_cond_signal(&oddcond))
		{
			perror("signal error from main");
		}			
	}
	else
	{
		if (pthread_cond_signal(&evencond))
		{
			perror("signal error from main");
		}
	}
	sleep(2);		
	return 0;
}

void* run(void* arg)
{
	//while (1)
	{
		if (pthread_mutex_lock(&mutex))
		{
			perror("mutex error");
		}
		while (choice<0)
		{
			if ((int)arg%2==0)//even
			{
				if (pthread_cond_wait(&evencond, &mutex))
				{
					perror("cond lock error");
				}
			}
			else
			{
				if (pthread_cond_wait(&oddcond, &mutex))
				{
					perror("cond error");
				}
			}
		}
		while (choice!=(int)arg%2)
		{
			printf("thread#%d is running\n", (int)arg);
			if ((int)arg%2==0)
			{
				if (pthread_cond_broadcast(&oddcond))
				{
					perror("broadcast error");
				}
				if (pthread_cond_wait(&evencond, &mutex))
				{
					perror("cond error");
				}
			}
			else
			{
				if (pthread_cond_broadcast(&evencond))
				{
					perror("broadcast error");
				}
				if (pthread_cond_wait(&oddcond, &mutex))
				{
					perror("cond error");
				}
			}
		}
		if ((int)arg%2==0)
		{
			if (pthread_cond_signal(&evencond))
			{
				perror("signal error");
			}
		}
		else
		{
			if (pthread_cond_signal(&oddcond))
			{
				perror("signal error");
			}
		}
		printf("thread#%d is finished\n", (int)arg);
		if (pthread_mutex_unlock(&mutex))
		{
			perror("mutex unlock error");
		}
		
	}
}

void initialize()
{
	int i;
	pthread_t tid;
	for (i=0; i<ThreadNumber; i++)
	{
		if (pthread_create(&tid, NULL, run, (void*)i))
		{
			perror("create error");
		}
	}
}

2. write test: In this slightly weird test, I want to run multiple processes that are not related at all (i.e. not created by fork) and ask them to write different contents to the same file at almost the same time. Obviously the file can only end up a mess.

#include <stdlib.h>
#include <signal.h>
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/stat.h>

#define BufferSize 1024
#define WriteNumber 16
#define WriteSize (BufferSize/WriteNumber)

char buf[BufferSize];

void initialize(char ch);
void handler(int no);

int fd, count=0;

int main(int argc, char* argv[])
{
	sigset_t maskset, zeroset;
	sigemptyset(&zeroset);
	sigemptyset(&maskset);
	sigaddset(&maskset, SIGUSR1);
	
	if (argc!=3)
	{
		printf("usage: writetest.exe filename contents");
		exit(0);
	}
	initialize(argv[2][0]);
	if (sigprocmask(SIG_BLOCK, &maskset, NULL)<0)
	{
		perror("sigprocmask error");
	}
	if ((fd=open(argv[1], O_WRONLY|O_CREAT|O_TRUNC, S_IRUSR|S_IWUSR|S_IXUSR))<0)
	{
		perror("open error");
	}
	if (lseek(fd, 0, SEEK_SET)<0)
	{
		perror("lseek error");
	}
	if (signal(SIGUSR1, handler)==SIG_ERR)
	{
		perror("signal error");
	}
	while (count<WriteNumber)
	{
		sigsuspend(&zeroset);
	}
	return 0;
}

void handler(int no)
{
	if (write(fd, buf+count*WriteSize, WriteSize)!=WriteSize)
	{
		perror("write error");
	}
	else
	{
		count++;
	}
}

void initialize(char ch)
{
	int i;
	for (i=0; i<BufferSize; i++)
	{
		buf[i]=ch+i%10;
	}
}

The above program takes two parameters: the file name and the starting character of the pattern to be written into the file.
The following is the script that runs it:

./writetest.exe writeresult.txt 0 &
./writetest.exe writeresult.txt A &
./writetest.exe writeresult.txt a &

And another script continually sends signal to all three running processes:

# the same round of three signals, repeated 16 times (once per write)
i=0
while [ $i -lt 16 ]
do
	kill -USR1 15038
	kill -USR1 15037
	kill -USR1 15039
	sleep 1
	i=$((i+1))
done
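A side note on why the result is a mess: each process opens the file on its own, so each one has its own file offset starting at 0, and the chunks land on top of each other. If the goal were to keep every chunk, one option I know of is O_APPEND, which moves each write to the current end of the file. The sketch below only shows that flag; the helper name open_for_append is made up.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: open path so that every write() through the returned
   descriptor lands at the current end of the file, whoever else writes to it. */
int open_for_append(const char *path)
{
    return open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);
}
```

With two descriptors opened this way, writes made alternately through them end up one after another in the file instead of overwriting the same offsets.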

 

3. question 8 test: In the assignment we were asked a question about mapped I/O. Suppose one process maps a file into memory and another process uses the "write" system call on the same file at almost the same time; what happens? I really didn't know the answer, and I still don't after this test.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <signal.h>
#include <errno.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/wait.h>
#include <fcntl.h>

#define BufferSize 512

char buf[BufferSize];

void initialize(char* fileName);
void handler(int no);

int fd;

int main(int argc, char* argv[])
{
	if (argc!=2)
	{
		printf("usage: question8.exe filename\n");
		exit(0);
	}
	initialize(argv[1]);
	return 0;
}


void initialize(char* fileName)
{
	int i;
	pid_t pid;
	char* ptr;
	char endChar=' ';
	if ((fd=open(fileName, O_RDWR|O_CREAT|O_TRUNC, S_IRUSR|S_IWUSR))<0)
	{
		perror("open error\n");
	}
	if ((pid=fork())==0)
	{
		for (i=0; i<BufferSize; i++)
		{
			buf[i]='0'+i%10;
		}
		if (signal(SIGUSR1, handler)==SIG_ERR)
		{
			perror("signal error\n");
		}
		pause();
	}
	else
	{
		for (i=0; i<BufferSize; i++)
		{
			buf[i]='A'+i%10;
		}

		if (lseek(fd, BufferSize-1, SEEK_SET)<0)
		{
			perror("lseek error");
		}
		if (write(fd, &endChar, 1)!=1)
		{
			perror("write end of file error\n");
		}
		if (lseek(fd, 0, SEEK_SET)<0)
		{
			perror("seek error\n");
		}	
		if ((ptr=mmap(0, BufferSize, PROT_WRITE, MAP_SHARED, fd, 0))==MAP_FAILED)
		{
			perror("map error\n");
		}
		sleep(1);
		if (kill(pid, SIGUSR1)<0)
		{
			perror("kill error");
		}
		memcpy(ptr, buf, BufferSize);
		munmap(ptr, BufferSize);
		if (wait(&i)<0)
		{
			perror("wait error\n");
		}
	}
}

void handler(int no)
{
	if (write(fd, buf, BufferSize)!=BufferSize)
	{
		perror("sys write error\n");
	}
}	

 

4. exit test: This is a slightly malicious program, though it doesn't work. I wanted to register "main" with the "atexit" call so that the program never quits. Unfortunately my little plot for an "Undead Program" doesn't work against a kill signal. I guess a process terminated by a signal never goes through "exit" at all, so the functions registered with "atexit" are never called.

Later I wanted to scan all process IDs on the server and try to kill them, and of course to scan all signals for each process I could access. The only result is that I "successfully" killed myself. :) Naughty time is over.

#include <stdio.h>
#include <stdlib.h>
#include <signal.h>
#include <errno.h>
#include <unistd.h>

void handle(int no);

int main()
{
	int no;
	pid_t pid;
	for (pid=1; pid<65534; pid++)
	{
		if (kill(pid, SIGKILL)<0)
		{
			perror("won't work\n");
		}
		else
		{

			for (no=0; no<64; no++)
			{
				if (signal(no, handle)==SIG_ERR)
				{
					perror("signal error\n");
				}
			}
		}
	}
	printf("I am alive\n");
	sleep(1);
	if (atexit(main))
	{
		perror("atexit error");
	}
	return 0;
}


void handle(int no)
{
	abort();
}
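To check my guess about "atexit" without killing anything else, a harmless experiment can be sketched as follows: a child registers an exit handler that reports through a pipe, and then either calls exit normally or is killed with SIGKILL. All names here (atexit_ran_in_child, on_exit_handler, report_fd) are made up for this sketch.

```c
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int report_fd = -1;   /* write end of the pipe, used by the exit handler */

static void on_exit_handler(void)
{
    char mark = 'x';
    if (write(report_fd, &mark, 1) != 1) {
        /* nothing useful can be done at this point */
    }
}

/* Returns 1 if the child's atexit handler ran, 0 if it did not, -1 on error. */
int atexit_ran_in_child(int die_by_signal)
{
    int fds[2], status;
    char mark;
    ssize_t n;
    pid_t pid;

    if (pipe(fds) < 0)
        return -1;
    pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {                       /* child */
        close(fds[0]);
        report_fd = fds[1];
        atexit(on_exit_handler);
        if (die_by_signal) {
            kill(getpid(), SIGKILL);      /* the kernel ends the process here */
            pause();
        }
        exit(0);                          /* normal exit runs the handler */
    }
    close(fds[1]);                        /* parent keeps only the read end */
    n = read(fds[0], &mark, 1);           /* 0 means EOF: the handler never wrote */
    close(fds[0]);
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return n == 1 ? 1 : 0;
}
```

atexit_ran_in_child(0) reports that the handler ran, while atexit_ran_in_child(1) reports that it did not, which matches what happened to my "Undead Program".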
 
	 Signals are essentially a bitmap, so counting SIGCHLD cannot account for every child

The problem is like this:

We have a parent process that generates a number of child processes. The child processes will run for a very long time and the parent needs to continue processing other things (the parent will run forever). We don’t want to create zombies with these children when they finally end but we can’t just use a “fork and wait” combination in the parent since this will prevent the parent from doing its other processing. We could use a non-blocking waitpid call but then we would have to keep checking (i.e., polling) to find out when the children ended. So how can we prevent zombies (or at least get rid of them quickly) in this situation?

Answer: Unfortunately, the wording in this question was not as precise as it should have been. Specifically, I wanted you to provide a solution in which the parent knew exactly when a child finished and could take some action if necessary. But I didn’t say that so there are really two acceptable answers. The “proper” one is below:

One of the fundamental UNIX signals is SIGCLD (or SIGCHLD). When a child process terminates, a SIGCLD signal is automatically sent to the parent process. Unlike most other signals, the default action is to ignore this signal. However, we can use it to “reap” dead child processes by creating a SIGCLD signal handler in the parent. When this signal is received, the signal handler can simply execute a wait() call to permit the resources of the child to be reclaimed (a truly robust version may be a little more complex than this).

Alternate answer: We can call fork twice so that the parent creates a child and then this child creates another child. When the first child exits, the second child is inherited by init and will be cleaned up whenever it finishes. The original parent does a wait for its first child and reaps it as soon as it exits. The parent can then keep on going since it has no more direct children.

My idea is like this:

About theory assignment3 no.8 2005-04-12
Professor,

I am reviewing the old assignments and think the answer to assignment 3, question 8 is not perfect. Since pending signals are essentially kept as a bitmap in each process, signals that accumulate are only counted once. So "counting" SIGCHLD signals is not a reliable way to prevent child processes from becoming zombies; i.e. while the parent process is blocked or switched off the CPU, the exit of more than one child generates only one SIGCHLD for the parent, which is just one bit flipped. So there is a big chance that many dead child processes never get "waited" for. The following is the output of my test program, along with a snapshot of "ps -A" from another terminal to prove that many children are indeed zombies:

Output of the test program; please note that only a few of the dead children are captured by "wait":

[root@sec05 mytest]# gcc -g sigchildtest.c -o sigchildtest.exe
[root@sec05 mytest]# ./sigchildtest.exe
child 19090 sleep 1 and dies
child 19091 sleep 1 and dies
child 19092 sleep 1 and dies
child 19093 sleep 1 and dies
child 19094 sleep 1 and dies
child 19095 sleep 1 and dies
child 19096 sleep 1 and dies
child 19097 sleep 1 and dies
child 19098 sleep 1 and dies
child 19099 sleep 1 and dies
child 19090 dies with status of 0
child 19091 dies with status of 256
child 19092 dies with status of 512
child 19093 dies with status of 768

And snap shot from "ps":

19089 pts/17 00:00:00 sigchildtest.ex
19094 pts/17 00:00:00 sigchildtest.ex
19095 pts/17 00:00:00 sigchildtest.ex
19096 pts/17 00:00:00 sigchildtest.ex
19097 pts/17 00:00:00 sigchildtest.ex
19098 pts/17 00:00:00 sigchildtest.ex
19099 pts/17 00:00:00 sigchildtest.ex

The following is the test program:
 
#include <stdio.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <stdlib.h>
#include <unistd.h>

void handler(int no);

int main()
{
	int i;
	if (signal(SIGCHLD,handler)==SIG_ERR)
	{
		printf("signal error");
	}
	for (i=0; i<10; i++)
	{
		if (fork()==0)
		{
			while (1)
			{
				printf("child %d sleep %d and dies\n", getpid(), 1);
				sleep(1);
				//if (i%2==0)
				{
					exit(i);
				}
			}		
		}
	}
	while (1)
	{
		sleep(5);
	}
	return 0;
}



void handler(int no)
{
	pid_t pid;
	int status;
	if ((pid=wait(&status))<0)
	{
		printf("wait error");
	}
	else
	{
		printf("child %d dies with status of %d\n", pid, status);
	}
}
		
wait and sigchild 2005-04-12
Your analysis is correct. However, you'll note in my answer that I said "a truly robust version may be a little more complex than this". In actual fact, we do not use a simple wait call. Instead, we use a waitpid call in a loop. The arguments to waitpid are set so that it waits for any process (not a specific one) and that it is non-blocking. If run this way, the waitpid will loop inside the signal handler, cleaning up ONE OR MORE children that have ended.

Perhaps I should have put this full explanation in the solution sheet...but I just wanted to see if students could figure out the basic idea.

Dr Eavis
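For my own record, the waitpid loop described in this reply can be sketched like this. It is only my reading of the explanation, not the official solution; the names reap_finished_children and sigchld_handler are made up.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Reap every child that has already finished, without blocking.
   Returns how many children were collected by this call. */
int reap_finished_children(void)
{
    int status, count = 0;

    /* -1 means any child; WNOHANG makes waitpid return 0 at once
       when no further child has finished yet. */
    while (waitpid(-1, &status, WNOHANG) > 0)
    {
        count++;
    }
    return count;
}

/* SIGCHLD handler: one delivery may stand for several dead children. */
void sigchld_handler(int signo)
{
    int saved_errno = errno;    /* waitpid may overwrite errno */
    (void)signo;
    reap_finished_children();
    errno = saved_errno;
}
```

In my test program this would replace the single wait call, with the handler installed by signal(SIGCHLD, sigchld_handler), so that one signal cleans up however many children have died by then.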

The distinction that needs to be made between C and C++ headers

ANSI-C++     ANSI-C
<cassert>    <assert.h>
<cctype>     <ctype.h>
<cerrno>     <errno.h>
<cfloat>     <float.h>
<ciso646>    <iso646.h>
<climits>    <limits.h>
<clocale>    <locale.h>
<cmath>      <math.h>
<csetjmp>    <setjmp.h>
<csignal>    <signal.h>
<cstdarg>    <stdarg.h>
<cstddef>    <stddef.h>
<cstdio>     <stdio.h>
<cstdlib>    <stdlib.h>
<cstring>    <string.h>
<ctime>      <time.h>
<cwchar>     <wchar.h>
<cwctype>    <wctype.h>

 
