~ubuntu-kernel/ubuntu/+source/linux/+git/natty:master

Last commit made on 2012-09-06
Get this branch:
git clone -b master https://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/natty

Recent commits

6c417a4... by Luis Henriques

UBUNTU: Ubuntu-2.6.38-16.67

Signed-off-by: Luis Henriques <email address hidden>

dbecd07... by Oleg Nesterov <email address hidden>

cred: copy_process() should clear child->replacement_session_keyring

CVE-2012-2745

BugLink: http://bugs.launchpad.net/bugs/1023535

keyctl_session_to_parent(task) sets ->replacement_session_keyring,
which should be processed and cleared by key_replace_session_keyring().

However, this task can fork before it notices TIF_NOTIFY_RESUME and
the new child gets the bogus ->replacement_session_keyring copied by
dup_task_struct(). This is obviously wrong and, if nothing else, this
leads to put_cred(already_freed_cred).

Change copy_creds() to clear this member. If copy_process() fails
before this point, the wrong ->replacement_session_keyring doesn't
matter, because exit_creds() won't be called.
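
As a rough illustration of the fix described above, here is a userspace sketch, not the kernel code. The struct and function names ending in _model are hypothetical stand-ins: dup_task_model() plays the role of dup_task_struct() (a plain copy of the parent), and copy_creds_model() shows only the one step the commit adds, clearing the inherited pointer in the child.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, heavily simplified stand-ins for the kernel structures. */
struct cred_model {
    int usage;
};

struct task_model {
    struct cred_model *cred;
    struct cred_model *replacement_session_keyring;
};

/* Models dup_task_struct(): the child starts as a copy of the parent,
 * so it also inherits any pending replacement_session_keyring. */
static void dup_task_model(struct task_model *child,
                           const struct task_model *parent)
{
    memcpy(child, parent, sizeof(*child));
}

/* Models only the step added by the fix: the child must not keep the
 * parent's pending replacement keyring. */
static void copy_creds_model(struct task_model *child)
{
    child->replacement_session_keyring = NULL;
}
```

In this model the parent keeps its pending pointer, so it can still process and clear it itself, while the child never sees it.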

Cc: <email address hidden>
Signed-off-by: Oleg Nesterov <email address hidden>
Acked-by: David Howells <email address hidden>
Signed-off-by: Linus Torvalds <email address hidden>
(cherry picked from commit 79549c6dfda0603dba9a70a53467ce62d9335c33)

Signed-off-by: Tim Gardner <email address hidden>
Acked-by: Brad Figg <email address hidden>

1cd9ff0... by Andrew Lutomirski

mm: Hold a file reference in madvise_remove

CVE-2012-3511

BugLink: http://bugs.launchpad.net/bugs/1042447

Otherwise the code races with munmap (causing a use-after-free
of the vma) or with close (causing a use-after-free of the struct
file).

The bug was introduced by commit 90ed52ebe481 ("[PATCH] holepunch: fix
mmap_sem i_mutex deadlock")

Cc: Hugh Dickins <email address hidden>
Cc: Miklos Szeredi <email address hidden>
Cc: Badari Pulavarty <email address hidden>
Cc: Nick Piggin <email address hidden>
Cc: <email address hidden>
Signed-off-by: Andy Lutomirski <email address hidden>
Signed-off-by: Linus Torvalds <email address hidden>
(back ported from commit 9ab4233dd08036fe34a89c7dc6f47a8bf2eb29eb)
Acked-by: Herton Ronaldo Krzesinski <email address hidden>
Acked-by: Brad Figg <email address hidden>
Signed-off-by: Tim Gardner <email address hidden>

ce862bd... by Ben Hutchings (Solarflare)

sfc: Fix maximum number of TSO segments and minimum TX queue size

CVE-2012-3412

BugLink: http://bugs.launchpad.net/bugs/1037456

Currently an skb requiring TSO may not fit within a minimum-size TX
queue. The TX queue selected for the skb may stall and trigger the TX
watchdog repeatedly (since the problem skb will be retried after the
TX reset). This issue is designated as CVE-2012-3412.

Set the maximum number of TSO segments for our devices to 100. This
should make no difference to behaviour unless the actual MSS is less
than about 700. Increase the minimum TX queue size accordingly to
allow for 2 worst-case skbs, so that there will definitely be space
to add an skb after we wake a queue.
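
The "about 700" figure above can be checked with simple arithmetic. The sketch below assumes a payload of just under 64K and a ceiling division to count segments; the helper names and the _SKETCH constant are made up for illustration and are not taken from the sfc driver.

```c
#include <stdint.h>

/* Segment limit stated in the commit message. */
#define TSO_MAX_SEGS_SKETCH 100u

/* Number of segments a payload is split into at a given MSS
 * (ceiling division; the last segment may be short). */
static uint32_t tso_segs(uint32_t payload_len, uint32_t mss)
{
    return (payload_len + mss - 1) / mss;
}

/* Nonzero if an skb with this payload and MSS would exceed the limit. */
static int exceeds_tso_limit(uint32_t payload_len, uint32_t mss)
{
    return tso_segs(payload_len, mss) > TSO_MAX_SEGS_SKETCH;
}
```

With a payload of 65483 bytes, an MSS of 700 stays under the limit while an MSS of 600 does not, which is consistent with the threshold quoted above.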

To avoid invalidating existing configurations, change
efx_ethtool_set_ringparam() to fix up values that are too small rather
than returning -EINVAL.

Signed-off-by: Ben Hutchings <email address hidden>
Signed-off-by: David S. Miller <email address hidden>
(back ported from commit 7e6d06f0de3f74ca929441add094518ae332257c)

Signed-off-by: Tim Gardner <email address hidden>
Acked-by: Herton Krzesinski <email address hidden>

7ae3b27... by Ben Hutchings (Solarflare)

sfc: Replace some literal constants with EFX_PAGE_SIZE/EFX_BUF_SIZE

CVE-2012-3412

BugLink: http://bugs.launchpad.net/bugs/1037456

The 'page size' for PCIe DMA, i.e. the alignment of boundaries at
which DMA must be broken, is 4KB. Name this value as EFX_PAGE_SIZE
and use it in efx_max_tx_len(). Redefine EFX_BUF_SIZE as
EFX_PAGE_SIZE since its value is also a result of that requirement,
and use it in efx_init_special_buffer().
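
To make the 4KB boundary rule concrete, the following sketch computes the longest run starting at a DMA address that does not cross such a boundary. It is an assumption about what a helper like efx_max_tx_len() needs to compute, not a copy of the driver; PAGE_SIZE_SKETCH and max_len_to_boundary() are hypothetical names.

```c
#include <stdint.h>

/* The 4KB DMA 'page size' described in the commit message. */
#define PAGE_SIZE_SKETCH 4096u

/* Largest length starting at dma_addr that stays within one 4KB page.
 * ~dma_addr & (size - 1) is the distance to the last byte of the page,
 * so adding 1 gives the number of bytes up to the boundary. */
static uint32_t max_len_to_boundary(uint64_t dma_addr)
{
    return (uint32_t)(~dma_addr & (PAGE_SIZE_SKETCH - 1)) + 1;
}
```

A page-aligned address yields the full 4096 bytes, and an address on the last byte of a page yields 1.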

Signed-off-by: Ben Hutchings <email address hidden>
(back ported from commit 5b6262d0ccf759a16fabe11d904a2531125a4b71)

Signed-off-by: Tim Gardner <email address hidden>
Acked-by: Brad Figg <email address hidden>
Acked-by: Herton Krzesinski <email address hidden>

467163b... by Ben Hutchings (Solarflare)

tcp: Apply device TSO segment limit earlier

CVE-2012-3412

BugLink: http://bugs.launchpad.net/bugs/1037456

Cache the device gso_max_segs in sock::sk_gso_max_segs and use it to
limit the size of TSO skbs. This avoids the need to fall back to
software GSO for local TCP senders.
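
A minimal sketch of the idea, assuming the limit is applied by capping the byte size of a TSO skb at gso_max_segs times the current MSS. The function name and parameters are hypothetical; the actual patch touches several places in the TCP output path.

```c
#include <stdint.h>

/* Cap the size goal for a TSO skb so that it never expands to more
 * than gso_max_segs segments at the given MSS. */
static uint32_t tso_size_limit(uint32_t size_goal, uint32_t mss,
                               uint32_t gso_max_segs)
{
    uint32_t by_segs = gso_max_segs * mss;

    return size_goal < by_segs ? size_goal : by_segs;
}
```

For an ordinary MSS the cap has no effect; for a very small MSS the skb is built small enough in the first place, so the later fallback to software GSO is not needed.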

Signed-off-by: Ben Hutchings <email address hidden>
Signed-off-by: David S. Miller <email address hidden>
(back ported from commit 1485348d2424e1131ea42efc033cbd9366462b01)
Acked-by: Herton Krzesinski <email address hidden>
Signed-off-by: Tim Gardner <email address hidden>

1d25bb6... by Neal Cardwell <email address hidden>

tcp: do not scale TSO segment size with reordering degree

CVE-2012-3412

BugLink: http://bugs.launchpad.net/bugs/1037456

Since 2005 (c1b4a7e69576d65efc31a8cea0714173c2841244)
tcp_tso_should_defer has been using tcp_max_burst() as a target limit
for deciding how large to make outgoing TSO packets when not using
sysctl_tcp_tso_win_divisor. But since 2008
(dd9e0dda66ba38a2ddd1405ac279894260dc5c36) tcp_max_burst() returns the
reordering degree. We should not have tcp_tso_should_defer attempt to
build larger segments just because there is more reordering. This
commit splits the notion of deferral size used in TSO from the notion
of burst size used in cwnd moderation, and returns the TSO deferral
limit to its original value.

Signed-off-by: Neal Cardwell <email address hidden>
Signed-off-by: David S. Miller <email address hidden>
(cherry picked from commit 6b5a5c0dbb11dcff4e1b0f1ef87a723197948ed4)
Acked-by: Herton Krzesinski <email address hidden>
Signed-off-by: Tim Gardner <email address hidden>

ad4f5bb... by Ben Hutchings (Solarflare)

net: Allow driver to limit number of GSO segments per skb

CVE-2012-3412

BugLink: http://bugs.launchpad.net/bugs/1037456

A peer (or local user) may cause TCP to use a nominal MSS of as little
as 88 (actual MSS of 76 with timestamps). Given that we have a
sufficiently prodigious local sender and the peer ACKs quickly enough,
it is nevertheless possible to grow the window for such a connection
to the point that we will try to send just under 64K at once. This
results in a single skb that expands to 861 segments.
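
The 861 figure can be reproduced with a short calculation. The sketch assumes a maximum skb length of 65535 bytes, a 20-byte IPv4 header, a 20-byte TCP header, and 12 bytes of timestamp option; these header sizes are assumptions for the illustration, and the helper names are hypothetical.

```c
#include <stdint.h>

/* Actual MSS once per-segment TCP options are subtracted from the
 * nominal MSS (88 - 12 = 76 with timestamps). */
static uint32_t actual_mss(uint32_t nominal_mss, uint32_t tcp_opt_len)
{
    return nominal_mss - tcp_opt_len;
}

/* Number of full-MSS segments that fit in one skb after the headers. */
static uint32_t full_segments(uint32_t max_skb_len, uint32_t hdr_len,
                              uint32_t mss)
{
    return (max_skb_len - hdr_len) / mss;
}
```

With these assumptions, full_segments(65535, 20 + 20 + 12, actual_mss(88, 12)) evaluates to 861, matching the number in the commit message.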

In some drivers with TSO support, such an skb will require hundreds of
DMA descriptors; a substantial fraction of a TX ring or even more than
a full ring. The TX queue selected for the skb may stall and trigger
the TX watchdog repeatedly (since the problem skb will be retried
after the TX reset). This particularly affects sfc, for which the
issue is designated as CVE-2012-3412.

Therefore:
1. Add the field net_device::gso_max_segs holding the device-specific
   limit.
2. In netif_skb_features(), if the number of segments is too high then
   mask out GSO features to force fall back to software GSO.
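
The two steps above can be sketched as follows. This is a simplified model, not the kernel's netif_skb_features(): the FEAT_* flags, struct dev_model, and skb_features_model() are hypothetical names, and only the segment-count check is shown.

```c
#include <stdint.h>

/* Hypothetical feature bits standing in for the real NETIF_F_* flags. */
#define FEAT_SG       (1u << 0)
#define FEAT_TSO      (1u << 1)
#define FEAT_TSO6     (1u << 2)
#define FEAT_GSO_MASK (FEAT_TSO | FEAT_TSO6)

/* Step 1: the device carries its own segment limit. */
struct dev_model {
    uint32_t features;
    uint16_t gso_max_segs;
};

/* Step 2: if the skb has too many segments, drop the hardware GSO
 * features so the stack falls back to software GSO for this skb. */
static uint32_t skb_features_model(const struct dev_model *dev,
                                   uint16_t gso_segs)
{
    uint32_t features = dev->features;

    if (gso_segs > dev->gso_max_segs)
        features &= ~FEAT_GSO_MASK;
    return features;
}
```

Features unrelated to GSO, such as scatter-gather in this model, are left untouched.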

Signed-off-by: Ben Hutchings <email address hidden>
Signed-off-by: David S. Miller <email address hidden>
(cherry picked from commit 30b678d844af3305cda5953467005cebb5d7b687)
Acked-by: Herton Krzesinski <email address hidden>
Signed-off-by: Tim Gardner <email address hidden>

d1cc3e9... by Tyler Hicks

eCryptfs: Initialize empty lower files when opening them

BugLink: http://bugs.launchpad.net/bugs/911507

Historically, eCryptfs has only initialized lower files in the
ecryptfs_create() path. Lower file initialization is the act of writing
the cryptographic metadata from the inode's crypt_stat to the header of
the file. The ecryptfs_open() path already expects that metadata to be
in the header of the file.

A number of users have reported empty lower files beneath their
eCryptfs mounts. Most of the causes for those empty files being left
around have been addressed, but the presence of empty files causes
problems due to the lack of proper cryptographic metadata.

To transparently solve this problem, this patch initializes empty lower
files in the ecryptfs_open() error path. If the metadata is unreadable
due to the lower inode size being 0, plaintext passthrough support is
not in use, and the metadata is stored in the header of the file (as
opposed to the user.ecryptfs extended attribute), the lower file will be
initialized.

The number of nested conditionals in ecryptfs_open() was getting out of
hand, so a helper function was created. To avoid the same nested
conditional problem, the conditional logic was reversed inside of the
helper function.
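
The conditions listed above, written with reversed (early-return) logic, might look like the sketch below. The struct and its field names are hypothetical stand-ins for state that eCryptfs keeps in the inode and crypt_stat; this is not the helper from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical summary of the state the decision depends on. */
struct lower_file_model {
    uint64_t lower_inode_size;   /* size of the lower inode in bytes */
    bool plaintext_passthrough;  /* plaintext passthrough in use */
    bool metadata_in_xattr;      /* metadata kept in user.ecryptfs */
};

/* Returns true only when all three conditions from the commit message
 * hold. Each check bails out early instead of nesting. */
static bool should_initialize_lower_file(const struct lower_file_model *f)
{
    if (f->lower_inode_size != 0)
        return false;
    if (f->plaintext_passthrough)
        return false;
    if (f->metadata_in_xattr)
        return false;
    return true;
}
```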

https://launchpad.net/bugs/911507

Signed-off-by: Tyler Hicks <email address hidden>
Cc: John Johansen <email address hidden>
Cc: Colin Ian King <email address hidden>
(backport from upstream commit e3ccaa9761200952cc269b1f4b7d7bb77a5e071b)
Signed-off-by: Colin Ian King <email address hidden>
Acked-by: Seth Forshee <email address hidden>
Signed-off-by: Tim Gardner <email address hidden>

b449827... by Weiping Pan <email address hidden>

rds: set correct msg_namelen

CVE-2012-3430

BugLink: http://bugs.launchpad.net/bugs/1031112

Jay Fenlason (<email address hidden>) found a bug:
recvfrom() on an RDS socket can return the contents of random kernel
memory to userspace if it is called with an address length larger than
sizeof(struct sockaddr_in).
rds_recvmsg() also fails to set the addr_len parameter properly before
returning, but that's just a bug.
There are also a number of cases where recvfrom() can return an entirely bogus
address. Anything in rds_recvmsg() that returns a non-negative value but does
not go through the "sin = (struct sockaddr_in *)msg->msg_name;" code path
at the end of the while(1) loop will return up to 128 bytes of kernel memory
to userspace.

I wrote two test programs to reproduce this bug. In rds_server, you will
see that fromAddr is overwritten and the following sock_fd is destroyed.
Yes, it is the programmer's fault to set msg_namelen incorrectly, but it is
better to make the kernel copy the real length of the address to user space
in such a case.
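
The effect of reporting the real address length can be modelled in userspace. In the sketch below, copy_addr_out() is a hypothetical stand-in for the step where the kernel copies the address back to the caller, bounded by the length the protocol reported; fixed_msg_namelen() stands for the value the fix makes rds_recvmsg() report. Neither is kernel code.

```c
#include <netinet/in.h>
#include <stddef.h>
#include <string.h>

/* Model of the copy-out step: at most kernel_len bytes are written,
 * where kernel_len is the msg_namelen set by the protocol. */
static size_t copy_addr_out(void *user_buf, size_t user_len,
                            const struct sockaddr_in *kaddr,
                            size_t kernel_len)
{
    size_t n = user_len < kernel_len ? user_len : kernel_len;

    memcpy(user_buf, kaddr, n);
    return n;
}

/* With the fix, the protocol reports the real size of the address. */
static size_t fixed_msg_namelen(void)
{
    return sizeof(struct sockaddr_in);
}
```

Even when the caller claims a 32-byte buffer, only sizeof(struct sockaddr_in) bytes are written in this model, so whatever sits after the address in the caller's memory is left alone.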

How to run the test programs:
I tested them on a 32-bit x86 system running 3.5.0-rc7.

1. Compile:
gcc -o rds_client rds_client.c
gcc -o rds_server rds_server.c

2. Run ./rds_server on one console.

3. Run ./rds_client on another console.

4. You will see something like:
server is waiting to receive data...
old socket fd=3
server received data from client:data from client
msg.msg_namelen=32
new socket fd=-1067277685
sendmsg()
: Bad file descriptor

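/***************** rds_client.c ********************/

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

int main(void)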
{
 int sock_fd;
 struct sockaddr_in serverAddr;
 struct sockaddr_in toAddr;
 char recvBuffer[128] = "data from client";
 struct msghdr msg;
 struct iovec iov;

 sock_fd = socket(AF_RDS, SOCK_SEQPACKET, 0);
 if (sock_fd < 0) {
  perror("create socket error\n");
  exit(1);
 }

 memset(&serverAddr, 0, sizeof(serverAddr));
 serverAddr.sin_family = AF_INET;
 serverAddr.sin_addr.s_addr = inet_addr("127.0.0.1");
 serverAddr.sin_port = htons(4001);

 if (bind(sock_fd, (struct sockaddr*)&serverAddr, sizeof(serverAddr)) < 0) {
  perror("bind() error\n");
  close(sock_fd);
  exit(1);
 }

 memset(&toAddr, 0, sizeof(toAddr));
 toAddr.sin_family = AF_INET;
 toAddr.sin_addr.s_addr = inet_addr("127.0.0.1");
 toAddr.sin_port = htons(4000);
 msg.msg_name = &toAddr;
 msg.msg_namelen = sizeof(toAddr);
 msg.msg_iov = &iov;
 msg.msg_iovlen = 1;
 msg.msg_iov->iov_base = recvBuffer;
 msg.msg_iov->iov_len = strlen(recvBuffer) + 1;
 msg.msg_control = 0;
 msg.msg_controllen = 0;
 msg.msg_flags = 0;

 if (sendmsg(sock_fd, &msg, 0) == -1) {
  perror("sendto() error\n");
  close(sock_fd);
  exit(1);
 }

 printf("client send data:%s\n", recvBuffer);

 memset(recvBuffer, '\0', 128);

 msg.msg_name = &toAddr;
 msg.msg_namelen = sizeof(toAddr);
 msg.msg_iov = &iov;
 msg.msg_iovlen = 1;
 msg.msg_iov->iov_base = recvBuffer;
 msg.msg_iov->iov_len = 128;
 msg.msg_control = 0;
 msg.msg_controllen = 0;
 msg.msg_flags = 0;
 if (recvmsg(sock_fd, &msg, 0) == -1) {
  perror("recvmsg() error\n");
  close(sock_fd);
  exit(1);
 }

 printf("receive data from server:%s\n", recvBuffer);

 close(sock_fd);

 return 0;
}

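/***************** rds_server.c ********************/

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

int main(void)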
{
 struct sockaddr_in fromAddr;
 int sock_fd;
 struct sockaddr_in serverAddr;
 unsigned int addrLen;
 char recvBuffer[128];
 struct msghdr msg;
 struct iovec iov;

 sock_fd = socket(AF_RDS, SOCK_SEQPACKET, 0);
 if(sock_fd < 0) {
  perror("create socket error\n");
  exit(1);
 }

 memset(&serverAddr, 0, sizeof(serverAddr));
 serverAddr.sin_family = AF_INET;
 serverAddr.sin_addr.s_addr = inet_addr("127.0.0.1");
 serverAddr.sin_port = htons(4000);
 if (bind(sock_fd, (struct sockaddr*)&serverAddr, sizeof(serverAddr)) < 0) {
  perror("bind error\n");
  close(sock_fd);
  exit(1);
 }

 printf("server is waiting to receive data...\n");
 msg.msg_name = &fromAddr;

 /*
  * I add 16 to sizeof(fromAddr), ie 32,
  * and pay attention to the definition of fromAddr,
  * recvmsg() will overwrite sock_fd,
  * since kernel will copy 32 bytes to userspace.
  *
  * If you just use sizeof(fromAddr), it works fine.
  * */
 msg.msg_namelen = sizeof(fromAddr) + 16;
 /* msg.msg_namelen = sizeof(fromAddr); */
 msg.msg_iov = &iov;
 msg.msg_iovlen = 1;
 msg.msg_iov->iov_base = recvBuffer;
 msg.msg_iov->iov_len = 128;
 msg.msg_control = 0;
 msg.msg_controllen = 0;
 msg.msg_flags = 0;

 while (1) {
  printf("old socket fd=%d\n", sock_fd);
  if (recvmsg(sock_fd, &msg, 0) == -1) {
   perror("recvmsg() error\n");
   close(sock_fd);
   exit(1);
  }
  printf("server received data from client:%s\n", recvBuffer);
  printf("msg.msg_namelen=%d\n", msg.msg_namelen);
  printf("new socket fd=%d\n", sock_fd);
  strcat(recvBuffer, "--data from server");
  if (sendmsg(sock_fd, &msg, 0) == -1) {
   perror("sendmsg()\n");
   close(sock_fd);
   exit(1);
  }
 }

 close(sock_fd);
 return 0;
}

Signed-off-by: Weiping Pan <email address hidden>
Signed-off-by: David S. Miller <email address hidden>
(cherry picked from commit 06b6a1cf6e776426766298d055bb3991957d90a7)
Signed-off-by: Brad Figg <email address hidden>
Signed-off-by: Tim Gardner <email address hidden>