Wait, wait, how to do this?

Wednesday, May 01, 2013

Copy constructor and return value optimization

#include <stdio.h>

#include <string>

using std::string;

class MyString {
 public:
  MyString(const MyString& rhs) : value_(rhs.value()) {
    printf("copy ctor\n");
  }

  MyString(const string& value) : value_(value) {
    printf("ctor\n");
  }

  MyString& operator=(const MyString& rhs) {
    printf("assignment ctor\n");
    if (this != &rhs) {
      value_ = rhs.value();
    }
    return *this;
  }

  ~MyString() {
  }

  const string& value() const {
    return value_;
  }

  void set_value(const string& value) {
    value_ = value;
  }

 private:
  string value_;
};

MyString method1() {
  MyString my_string1 = MyString("true");
  return my_string1;
}

MyString method2(bool option) {
  MyString my_string1 = MyString("true");
  MyString my_string2 = MyString("false");
  return option ? my_string1 : my_string2;
}

MyString method3(bool option) {
  MyString my_string(option ? "true" : "false");
  return my_string;
}

MyString method4(bool option) {
  MyString my_string("false");
  my_string.set_value(option ? "true" : "false");
  return my_string;
}

int main(int argc, char* argv[]) {
  printf("calling method1\n");
  MyString my_string1 = method1();

  printf("\ncalling method2\n");
  MyString my_string2 = method2(true);

  printf("\ncalling method3\n");
  MyString my_string3 = method3(true);

  printf("\ncalling method4\n");
  MyString my_string4 = method4(true);

  return 0;
}

Let's compile it.
g++ -o rvo rvo.cc

Now run:
./rvo
calling method1
ctor

calling method2
ctor
ctor
copy ctor

calling method3
ctor

calling method4
ctor

Notice that line 64 may incur a call to the copy ctor as the compiler would need to make a temporary instance of MyString and copy the instance created on line 41 to that temporary variable. But as you can see from the execution results, there is no call to the copy constructor. In other words, the call to copy constructor is optimized away.

Whenever the compiler sees a temporary object being returned and assigned to another object of the same class, it tries to optimize away copy constructors and assignment constructors, which is what happened to method1/3/4. What happened to method2 is that there are two temporary objects and the compiler can not tell which one will be returned so the copy constructor has to be run.

Let's make it show up by disabling rvo.
g++ -o rvo rvo.cc -fno-elide-constructors

Run again:
calling method1
ctor
copy ctor
copy ctor
copy ctor

calling method2
ctor
copy ctor
ctor
copy ctor
copy ctor
copy ctor

calling method3
ctor
copy ctor
copy ctor

calling method4
ctor
copy ctor
copy ctor

Now copy constructor is called.

Implication: Even if the copy constructor has user visible effect, it can be optimized away. So do not rely on it. Do a copy if you need a copy.

Friday, October 02, 2009

Binary search vs linear search

Isn't the answer simply O(log N) vs O(N)? Let's back off and check their assumptions first.

To do a binary search, the items have to be sorted, which probably costs you O(N log N), while linear search does not come with such cost.

Secondly, the runtime cost O(log N) or O(N) is for finding a particular item. What happens if you want to add/remove an item to an existing list?

For linear search the insertion costs you O(1), since you do not care about the order. Deletion is O(N) because you need to locate the item first and then with a linked list implementation, it requires setting the pointers properly, which is a constant cost, while with an array implementation, it requires either marking that position as invalid, which will defeat the O(N) in the following operations, or shifting other items following the item you have deleted/inserted, which is O(N).

For binary search, it is tricky. With an array based implementation, both insertion and deletion will cost you O(N) as what we have seen in the linear search. You may do the marking for deletions while it will defeat the O(N) search cost if deletion is frequent. How about using a link list? A solution might come to you quickly is that a link list allows O(1) insertion/deletion once you find the item.But how can you achieve the O(log N) search cost with a linked list? A link list does not give you O(1) random access.

So when insertion/deletion operations are rare and the initial sorting cost is cheap compared to the number of search operations you will perform, binary search is the way to go. When you have a small number of items and the insertion/deletion operations are frequent, linear search does a better job. Similarly, when implementing sorting algorithms such as quick sort, insertion sort is often used when it comes to sorting a small number of items.

Sunday, August 09, 2009

What's the difference between auto, register, and volatile variables in C/C++

In Section 7.10 of APUE, there is an example to show the difference between these three types of variables, aka Program 7.5, which I found to be really confusing when I first came across it.

In a nutshell, a register variable should be stored in a register and a volatile variable should be kept in the memory. The "volatile" qualifier tells the compiler not to optimize that variable, i.e. leaving it as it is. Here is a more straight forward example to demonstrate their differences.


 1: #include <setjmp.h>
 2: #include <stdio.h>
 3: #include <stdlib.h>
 4:
 5: static jmp_buf jmpbuffer;
 6:
 7: int main(int argc, char *argv[]) {
 8:     int a;
 9:     register b;
10:     volatile c;
11:
12:     if (argc < 4) {
13:         printf("usage: %s a b c\n", argv[0]);
14:         exit(1);
15:     }
16:
17:     a = atoi(argv[1]);
18:     b = atoi(argv[2]);
19:     c = atoi(argv[3]);
20:
21:     printf("Before longjmp: a = %d, b = %d, c = %d\n", a, b, c);
22:     if (setjmp(jmpbuffer) != 0) {
23:         printf("After longjmp: a = %d, b = %d, c = %d\n", a, b, c);
24:         return 0;
25:     }
26:
27:     a++;
28:     b++;
29:     c++;
30:     longjmp(jmpbuffer, 1);
31:     return 0;
32: }

This small piece of code takes three integers as input, increments them and then calls a longjmp(). The value of these three variables are printed before and after the longjmp() call to show what is going on.

First, compile the code:
gcc test_jmp.c -o test_jmp

Then run the code:
./test_jmp 1 2 3

which prints the results:
Before longjmp: a = 1, b = 2, c = 3
After longjmp: a = 2, b = 3, c = 4

Not get it? What you see here is that after the longjmp(), the changes made to these three variables are kept although the code "jumps" to the place where you called setjmp(). But this is only the first half of the story. Let's make gcc optimize the code.

Compile the code again:
gcc test_jmp.c -o test_jmp -O

Run the code with the same inputs:
./test_jmp 1 2 3

which gives you the following results:
Before longjmp: a = 1, b = 2, c = 3
After longjmp: a = 1, b = 2, c = 4

What happened? a and b are restored to their values when setjmp() is called, i.e. the changes made to them after setjmp() disappeared. Only the volatile variable c does not roll back.

Myth dispelled: When you compile the code without optimization, all the variables are kept in memory. The register qualifier of variable b is ignored. When you compile it with optimization, both the auto variable a and register variable b are kept in registers and setjmp/longjmp is required to restore all the values stored in registers. For the volatile variable c, since we explicitly told the compiler not to optimize it, it stays in the memory thus all the changes made to it are kept after the longjmp().

Another note is that the auto variable a may not be pushed to the register, because its behavior is implementation dependent although the gcc we used in the above example does it. Therefore if you really want to make sure a variable is stored in a register, you have to use the "register" qualifier.

What's wrong with the example in APUE, i.e. program 7.5? Although you observe similar outputs, the things under the hood are different today. Compilers are very aggressive to make various optimizations. In program 7.5, the variables get assigned with values in the code (known at compile time) instead of user input (known at runtime). Therefore when you ask for optimization, the compiler goes ahead to replace the variables in the output code with their values directly. In other words, there is no register/memory storage issue in that particular example any more. Compilers have advanced a lot since when APUE was written.

Sunday, July 05, 2009

How to setup your ssh keys

Suppose that you have access to multiple machines. Of course you do not want to type your password each time you login to them remotely because you are as lazy as me. Once upon a time, you may login to any of them remotely by adding your local host to the .rhosts file and using rlogin. Soon you find out that rlogin is not secure, thus you have to switch to ssh. Now you have the same choice as .rhosts, which is .shosts. Unfortunately, most ssh servers disable looking up .shosts file by default, since it may easily become a security hole. Why? Because the .shosts file authenticates a host instead of a user. Any malicious user from a host on that list may get access, therefore a highly protected machine may be compromised as the consequences of another less protected machine being compromised. Plus, the host based solution limits your mobility. You want to access these machines when you are traveling, don't you?

So what do you do? Setup ssh keys and make the authentication happen at per user basis. Your ssh agent will try to read the key in your home directory and use it to challenge and respond to the remote machines. The keys are as safe as anything else in your home directory, if you set proper permissions.

Details:

1. Generate the key pairs:

ssh-keygen -t rsa

You have the option of generating rsa1/rsa/dsa keys, if you know the difference. In this cas, we simply go with rsa.

2. Add the public key to the machines you want to login.

scp .ssh/id_rsa.pub your_username@target_machine:tmp_key
ssh your_username@target_machine "cat tmp_key >> .ssh/authorized_keys"

3. Let your ssh agent know your private key at your local machine.

ssh-agent sh -c 'ssh-add < /dev/null && bash'

4. Try it!

ssh your_username@target_machine

Friday, June 26, 2009

How to create a tempfile

Problem: Your code needs to create a file (and perform write/read operations on it probably) during its execution. In many cases, there are two requirements.
1. It should be unique, which means that you do not want to overwrite any existing files. Sometimes you even want exclusive access to that file to avoid race conditions.
2. The file is no longer needed when the program is done, thus you need to clean up after yourself.

Solution: The problem is similar to the malloc() and free() problem, therefore it is not surprising to see a solution similar to alloca() and smart pointers. A tempfile is a file which will be automatically deleted when it is closed or the program terminates.

Short answer: tmpfile() is the way to go. It confirms to a bunch of standards, including 4.3 BSD/C89/C99/POSIX. When you do not want to have the file automatically deleted, use mkstemp() instead, which also gives you some control over the file name.

Long story:
mkstemp() and mktemp() will create a temporary file while DO NOT delete it automatically. Besides, on the manual page of mktemp(), it says "never use mktemp()" and suggests mkstemp() to avoid a race condition.

You may also use mkdtemp() to create a temporary directory. Again, the caveat is that mkdtemp() is similar to mktemp(), i.e. the directory does not get garbage collected automatically.

tempnam() and tmpname() create a name for a temporary file, which does not conflict with existing files. The catch is that on their manual pages, you will see the "never use this function" label and mkstemp() and tmpfile() are suggested here again.

Tips:
1. Always read the notes and bugs sections on the manual pages.
2. Check out related functions in the "see also" section on the manual pages. There may be a better choice among them.

Revived!

Finally I reclaimed my blog. The username was simply yeshao but I tried with my previous email account which expired three years ago. I tried to reclaim my blog but blogger's policy does not allow me to do that besides sending a new password to my non-exist email account.

Today I noticed the error message when I failed to login was "username not found" instead of wrong password. Then I told myself that I should try with my favorate username and it worked after a few password tests.

The blog was created in 2003, probably before blogspot was acquired by Google. Now I have the option to hook up the blog with my gmail account, replacing the expired one. Yay!

I am gonna use this blog for technical posts and keep my msn/live space personal.

So glad that I get my blog back!

Update: blogspot was acquired by Google in Feb 2003 so my account was created after the acquisition but at that time Google had not integrated all its products with Google account. Actually gmail was released on April 1, 2004 which later backed up Google account.

Update 2: The next thing to reclaim is an email account which I do not know the answer to the security question yet.

Saturday, July 05, 2003

hehe,yeshao's blog is online now!